@c This is part of the Emacs manual.
@c Copyright (C) 1985, 86, 87, 93, 94, 95, 97, 2000, 2001
@c Free Software Foundation, Inc.
@c See file emacs.texi for copying conditions.
@node Search, Fixit, Display, Top
@chapter Searching and Replacement
@cindex searching
@cindex finding strings within text

Like other editors, Emacs has commands for searching for occurrences of
a string. The principal search command is unusual in that it is
@dfn{incremental}; it begins to search before you have finished typing the
search string. There are also nonincremental search commands more like
those of other editors.

Besides the usual @code{replace-string} command that finds all
occurrences of one string and replaces them with another, Emacs has a
more flexible replacement command called @code{query-replace}, which
asks interactively which occurrences to replace.

@menu
* Incremental Search::     Search happens as you type the string.
* Nonincremental Search::  Specify entire string and then search.
* Word Search::            Search for sequence of words.
* Regexp Search::          Search for match for a regexp.
* Regexps::                Syntax of regular expressions.
* Search Case::            To ignore case while searching, or not.
* Configuring Scrolling::  Scrolling within incremental search.
* Replace::                Search, and replace some or all matches.
* Other Repeating Search:: Operating on all matches for some regexp.
@end menu

@node Incremental Search, Nonincremental Search, Search, Search
@section Incremental Search

@cindex incremental search
An incremental search begins searching as soon as you type the first
character of the search string. As you type in the search string, Emacs
shows you where the string (as you have typed it so far) would be
found. When you have typed enough characters to identify the place you
want, you can stop. Depending on what you plan to do next, you may or
may not need to terminate the search explicitly with @key{RET}.

@c WideCommands
@table @kbd
@item C-s
Incremental search forward (@code{isearch-forward}).
@item C-r
Incremental search backward (@code{isearch-backward}).
@end table

@kindex C-s
@findex isearch-forward
@kbd{C-s} starts a forward incremental search. It reads characters
from the keyboard, and moves point past the next occurrence of those
characters. If you type @kbd{C-s} and then @kbd{F}, that puts the
cursor after the first @samp{F} (the first following the starting point, since
this is a forward search). Then if you type an @kbd{O}, you will see
the cursor move just after the first @samp{FO} (the @samp{F} in that
@samp{FO} may or may not be the first @samp{F}). After another
@kbd{O}, the cursor moves after the first @samp{FOO} after the place
where you started the search. At each step, the buffer text that
matches the search string is highlighted, if the terminal can do that;
the current search string is always displayed in the echo area.

If you make a mistake in typing the search string, you can cancel
characters with @key{DEL}. Each @key{DEL} cancels the last character of
the search string. This does not happen until Emacs is ready to read another
input character; first it must either find, or fail to find, the character
you want to erase. If you do not want to wait for this to happen, use
@kbd{C-g} as described below.

When you are satisfied with the place you have reached, you can type
@key{RET}, which stops searching, leaving the cursor where the search
brought it. Also, any command not specially meaningful in searches
stops the searching and is then executed. Thus, typing @kbd{C-a}
would exit the search and then move to the beginning of the line.
@key{RET} is necessary only if the next command you want to type is a
printing character, @key{DEL}, @key{RET}, or another character that is
special within searches (@kbd{C-q}, @kbd{C-w}, @kbd{C-r}, @kbd{C-s},
@kbd{C-y}, @kbd{M-y}, @kbd{M-r}, @kbd{M-s}, and some other
meta-characters).

Sometimes you search for @samp{FOO} and find one, but not the one you
expected to find. There was a second @samp{FOO} that you forgot
about, before the one you were aiming for. In this event, type
another @kbd{C-s} to move to the next occurrence of the search string.
You can repeat this any number of times. If you overshoot, you can
cancel some @kbd{C-s} characters with @key{DEL}.

After you exit a search, you can search for the same string again by
typing just @kbd{C-s C-s}: the first @kbd{C-s} is the key that invokes
incremental search, and the second @kbd{C-s} means ``search again.''

To reuse earlier search strings, use the @dfn{search ring}. The
commands @kbd{M-p} and @kbd{M-n} move through the ring to pick a search
string to reuse. These commands leave the selected search ring element
in the minibuffer, where you can edit it. Type @kbd{C-s} or @kbd{C-r}
to terminate editing the string and search for it.

If your string is not found at all, the echo area says @samp{Failing
I-Search}. The cursor is after the place where Emacs found as much of your
string as it could. Thus, if you search for @samp{FOOT}, and there is no
@samp{FOOT}, you might see the cursor after the @samp{FOO} in @samp{FOOL}.
At this point there are several things you can do. If your string was
mistyped, you can rub some of it out and correct it. If you like the place
you have found, you can type @key{RET} or some other Emacs command to
remain there. Or you can type @kbd{C-g}, which
removes from the search string the characters that could not be found (the
@samp{T} in @samp{FOOT}), leaving those that were found (the @samp{FOO} in
@samp{FOOT}). A second @kbd{C-g} at that point cancels the search
entirely, returning point to where it was when the search started.

An upper-case letter in the search string makes the search
case-sensitive. If you delete the upper-case character from the search
string, it ceases to have this effect. @xref{Search Case}.

To search for a newline, type @kbd{C-j}. To search for another
control character, such as control-S or carriage return, you must quote
it by typing @kbd{C-q} first. This function of @kbd{C-q} is analogous
to its use for insertion (@pxref{Inserting Text}): it causes the
following character to be treated the way any ``ordinary'' character is
treated in the same context. You can also specify a character by its
octal code: enter @kbd{C-q} followed by a sequence of octal digits.

@cindex searching for non-@acronym{ASCII} characters
@cindex input method, during incremental search
To search for non-@acronym{ASCII} characters, you must use an input method
(@pxref{Input Methods}). If an input method is enabled in the
current buffer when you start the search, you can use it while you
type the search string also. Emacs indicates that by including the
input method mnemonic in its prompt, like this:

@example
I-search [@var{im}]:
@end example

@noindent
@findex isearch-toggle-input-method
@findex isearch-toggle-specified-input-method
where @var{im} is the mnemonic of the active input method. You can
toggle (enable or disable) the input method while you type the search
string with @kbd{C-\} (@code{isearch-toggle-input-method}). You can
turn on a certain (non-default) input method with @kbd{C-^}
(@code{isearch-toggle-specified-input-method}), which prompts for the
name of the input method. The input method you enable during
incremental search remains enabled in the current buffer afterwards.

If a search is failing and you ask to repeat it by typing another
@kbd{C-s}, it starts again from the beginning of the buffer.
Repeating a failing reverse search with @kbd{C-r} starts again from
the end. This is called @dfn{wrapping around}, and @samp{Wrapped}
appears in the search prompt once this has happened. If you keep on
going past the original starting point of the search, it changes to
@samp{Overwrapped}, which means that you are revisiting matches that
you have already seen.

@cindex quitting (in search)
The @kbd{C-g} ``quit'' character does special things during searches;
just what it does depends on the status of the search. If the search has
found what you specified and is waiting for input, @kbd{C-g} cancels the
entire search. The cursor moves back to where you started the search. If
@kbd{C-g} is typed when there are characters in the search string that have
not been found---because Emacs is still searching for them, or because it
has failed to find them---then the search string characters which have not
been found are discarded from the search string. With them gone, the
search is now successful and waiting for more input, so a second @kbd{C-g}
will cancel the entire search.

You can change to searching backwards with @kbd{C-r}. If a search fails
because the place you started was too late in the file, you should do this.
Repeated @kbd{C-r} keeps looking for more occurrences backwards. A
@kbd{C-s} starts going forwards again. @kbd{C-r} in a search can be canceled
with @key{DEL}.

@kindex C-r
@findex isearch-backward
If you know initially that you want to search backwards, you can use
@kbd{C-r} instead of @kbd{C-s} to start the search, because @kbd{C-r} as
a key runs a command (@code{isearch-backward}) to search backward. A
backward search finds matches that are entirely before the starting
point, just as a forward search finds matches that begin after it.

The characters @kbd{C-y} and @kbd{C-w} can be used in incremental
search to grab text from the buffer into the search string. This
makes it convenient to search for another occurrence of text at point.
@kbd{C-w} copies the character or word after point as part of the
search string, advancing point over it. (The decision, whether to
copy a character or a word, is heuristic.) Another @kbd{C-s} to
repeat the search will then search for a string including that
character or word.

@kbd{C-y} is similar to @kbd{C-w} but copies all the rest of the
current line into the search string. Both @kbd{C-y} and @kbd{C-w}
convert the text they copy to lower case if the search is currently
not case-sensitive; this is so the search remains case-insensitive.

The character @kbd{M-y} copies text from the kill ring into the search
string. It uses the same text that @kbd{C-y} as a command would yank.
@kbd{Mouse-2} in the echo area does the same.
@xref{Yanking}.

When you exit the incremental search, it sets the mark to where point
@emph{was}, before the search. That is convenient for moving back
there. In Transient Mark mode, incremental search sets the mark without
activating it, and does so only if the mark is not already active.

@cindex lazy search highlighting
@vindex isearch-lazy-highlight
When you pause for a little while during incremental search, it
highlights all other possible matches for the search string. This
makes it easier to anticipate where you can get to by typing @kbd{C-s}
or @kbd{C-r} to repeat the search. The short delay before highlighting
other matches helps indicate which match is the current one.
If you don't like this feature, you can turn it off by setting
@code{isearch-lazy-highlight} to @code{nil}.

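As a minimal sketch, assuming you keep such settings in your init
file (for example @file{~/.emacs}), the variable named above could be
set like this:

@example
;; Turn off highlighting of the other matches during incremental search.
(setq isearch-lazy-highlight nil)
@end example
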
@vindex isearch-lazy-highlight-face
@cindex faces for highlighting search matches
You can control how this highlighting looks by customizing the faces
@code{isearch} (used for the current match) and
@code{isearch-lazy-highlight-face} (for all the other matches).

@vindex isearch-mode-map
To customize the special characters that incremental search understands,
alter their bindings in the keymap @code{isearch-mode-map}. For a list
of bindings, look at the documentation of @code{isearch-mode} with
@kbd{C-h f isearch-mode @key{RET}}.

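For instance, here is a sketch of adding one binding to that keymap.
The choice of @key{F9} is arbitrary and only for illustration;
@code{isearch-yank-line} is the command that copies the rest of the
current line into the search string.

@example
;; Illustrative binding: make F9 grab the rest of the current line
;; during incremental search.  The key is an arbitrary choice.
(define-key isearch-mode-map [f9] 'isearch-yank-line)
@end example
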
@subsection Scrolling During Incremental Search

Vertical scrolling during incremental search can be enabled by
setting the customizable variable @code{isearch-allow-scroll} to a
non-@code{nil} value.

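A minimal sketch of enabling this from an init file (using an init
file rather than the Customize interface is an assumption here):

@example
;; Let scrolling commands run without leaving incremental search.
(setq isearch-allow-scroll t)
@end example
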
You can then use the vertical scroll-bar or certain keyboard
commands such as @kbd{@key{PRIOR}} (@code{scroll-down}),
@kbd{@key{NEXT}} (@code{scroll-up}) and @kbd{C-l} (@code{recenter})
within the search, thus letting you see more of the text near the
current match. You must run these commands via their key sequences to
stay in the search; typing @kbd{M-x @var{command-name}} always
terminates a search.

You can give prefix arguments to these commands in the usual way.
The current match cannot be scrolled out of the window; this is
intentional.

Several other commands, such as @kbd{C-x 2}
(@code{split-window-vertically}) and @kbd{C-x ^}
(@code{enlarge-window}), which don't scroll the window, are
nevertheless made available under this rubric, since they are likewise
handy during a search.

For a list of commands which are configured as scrolling commands by
default, and instructions on how to configure other commands the same
way, see @ref{Configuring Scrolling}.

@subsection Slow Terminal Incremental Search

Incremental search on a slow terminal uses a modified style of display
that is designed to take less time. Instead of redisplaying the buffer at
each place the search gets to, it creates a new single-line window and uses
that to display the line that the search has found. The single-line window
comes into play as soon as point moves outside of the text that is already
on the screen.

When you terminate the search, the single-line window is removed.
Emacs then redisplays the window in which the search was done, to show
the new position of point.

@vindex search-slow-speed
The slow terminal style of display is used when the terminal baud rate is
less than or equal to the value of the variable @code{search-slow-speed},
initially 1200. See @code{baud-rate} in @ref{Display Custom}.

@vindex search-slow-window-lines
The number of lines to use in slow terminal search display is controlled
by the variable @code{search-slow-window-lines}. Its normal value is 1.

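As a sketch, the two variables described above could be adjusted
together in an init file. The values 2400 and 3 are arbitrary
illustrations, not recommendations.

@example
;; Use the slow-terminal display at 2400 baud or less
;; (the default threshold is 1200).
(setq search-slow-speed 2400)
;; Show three lines in the temporary window instead of one.
(setq search-slow-window-lines 3)
@end example
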
@node Nonincremental Search, Word Search, Incremental Search, Search
@section Nonincremental Search
@cindex nonincremental search

Emacs also has conventional nonincremental search commands, which require
you to type the entire search string before searching begins.

@table @kbd
@item C-s @key{RET} @var{string} @key{RET}
Search for @var{string}.
@item C-r @key{RET} @var{string} @key{RET}
Search backward for @var{string}.
@end table

To do a nonincremental search, first type @kbd{C-s @key{RET}}. This
enters the minibuffer to read the search string; terminate the string
with @key{RET}, and then the search takes place. If the string is not
found, the search command signals an error.

When you type @kbd{C-s @key{RET}}, the @kbd{C-s} invokes incremental
search as usual. That command is specially programmed to invoke
nonincremental search, @code{search-forward}, if the string you
specify is empty. (Such an empty argument would otherwise be
useless.) But it does not call @code{search-forward} right away. First
it checks the next input character to see if it is @kbd{C-w},
which specifies a word search.
@ifinfo
@xref{Word Search}.
@end ifinfo
@kbd{C-r @key{RET}} does likewise, for a reverse incremental search.

@findex search-forward
@findex search-backward
Forward and backward nonincremental searches are implemented by the
commands @code{search-forward} and @code{search-backward}. These
commands may be bound to keys in the usual manner. The feature that you
can get to them via the incremental search commands exists for
historical reasons, and to avoid the need to find key sequences
for them.

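For example, here is a sketch of binding these two commands to keys.
The key sequences @kbd{C-c s} and @kbd{C-c r} are arbitrary choices
made for illustration; any free key sequence would do.

@example
;; Illustrative bindings for the nonincremental search commands.
(global-set-key "\C-cs" 'search-forward)
(global-set-key "\C-cr" 'search-backward)
@end example
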
@node Word Search, Regexp Search, Nonincremental Search, Search
@section Word Search
@cindex word search

Word search searches for a sequence of words without regard to how the
words are separated. More precisely, you type a string of many words,
using single spaces to separate them, and the string can be found even
if there are multiple spaces, newlines, or other punctuation characters
between these words.

Word search is useful for editing a printed document made with a text
formatter. If you edit while looking at the printed, formatted version,
you can't tell where the line breaks are in the source file. With word
search, you can search without having to know them.

@table @kbd
@item C-s @key{RET} C-w @var{words} @key{RET}
Search for @var{words}, ignoring details of punctuation.
@item C-r @key{RET} C-w @var{words} @key{RET}
Search backward for @var{words}, ignoring details of punctuation.
@end table

Word search is a special case of nonincremental search and is invoked
with @kbd{C-s @key{RET} C-w}. This is followed by the search string,
which must always be terminated with @key{RET}. Being nonincremental,
this search does not start until the argument is terminated. It works
by constructing a regular expression and searching for that; see
@ref{Regexp Search}.

Use @kbd{C-r @key{RET} C-w} to do backward word search.

@findex word-search-forward
@findex word-search-backward
Forward and backward word searches are implemented by the commands
@code{word-search-forward} and @code{word-search-backward}. These
commands may be bound to keys in the usual manner. They are available
via the incremental search commands both for historical reasons and
to avoid the need to find suitable key sequences for them.

@node Regexp Search, Regexps, Word Search, Search
@section Regular Expression Search
@cindex regular expression
@cindex regexp

A @dfn{regular expression} (@dfn{regexp}, for short) is a pattern
that denotes a class of alternative strings to match, possibly
infinitely many. GNU Emacs provides both incremental and
nonincremental ways to search for a match for a regexp.

@kindex C-M-s
@findex isearch-forward-regexp
@kindex C-M-r
@findex isearch-backward-regexp
Incremental search for a regexp is done by typing @kbd{C-M-s}
(@code{isearch-forward-regexp}), or by invoking @kbd{C-s} with a
prefix argument (whose value does not matter). This command reads a
search string incrementally just like @kbd{C-s}, but it treats the
search string as a regexp rather than looking for an exact match
against the text in the buffer. Each time you add text to the search
string, you make the regexp longer, and the new regexp is searched
for. To search backward for a regexp, use @kbd{C-M-r}
(@code{isearch-backward-regexp}), or @kbd{C-r} with a prefix argument.

All of the control characters that do special things within an
ordinary incremental search have the same function in incremental regexp
search. Typing @kbd{C-s} or @kbd{C-r} immediately after starting the
search retrieves the last incremental search regexp used; that is to
say, incremental regexp and non-regexp searches have independent
defaults. They also have separate search rings that you can access with
@kbd{M-p} and @kbd{M-n}.

If you type @key{SPC} in incremental regexp search, it matches any
sequence of whitespace characters, including newlines. If you want
to match just a space, type @kbd{C-q @key{SPC}}.

Note that adding characters to the regexp in an incremental regexp
search can make the cursor move back and start again. For example, if
you have searched for @samp{foo} and you add @samp{\|bar}, the cursor
backs up in case the first @samp{bar} precedes the first @samp{foo}.

@findex re-search-forward
@findex re-search-backward
Nonincremental search for a regexp is done by the functions
@code{re-search-forward} and @code{re-search-backward}. You can invoke
these with @kbd{M-x}, or bind them to keys, or invoke them by way of
incremental regexp search with @kbd{C-M-s @key{RET}} and @kbd{C-M-r
@key{RET}}.

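As an illustration, here is a sketch of calling
@code{re-search-forward} from Lisp with the regexp @samp{foo\|bar}
mentioned above. Searching from the start of the buffer and
returning the matched text are choices made for this example only.
Note that the backslash is doubled inside the Lisp string.

@example
;; Return the first "foo" or "bar" in the buffer, or nil if neither
;; occurs.  Point is restored afterwards by save-excursion.
(save-excursion
  (goto-char (point-min))
  (when (re-search-forward "foo\\|bar" nil t)
    (match-string 0)))
@end example
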
If you use the incremental regexp search commands with a prefix
argument, they perform ordinary string search, like
@code{isearch-forward} and @code{isearch-backward}. @xref{Incremental
Search}.

@node Regexps, Search Case, Regexp Search, Search
@section Syntax of Regular Expressions
@cindex syntax of regexps

This manual describes regular expression features that users
typically want to use. There are additional features that are
mainly used in Lisp programs; see @ref{Regular Expressions,,,
elisp, The Emacs Lisp Reference Manual}.

Regular expressions have a syntax in which a few characters are
special constructs and the rest are @dfn{ordinary}. An ordinary
character is a simple regular expression which matches that same
character and nothing else. The special characters are @samp{$},
@samp{^}, @samp{.}, @samp{*}, @samp{+}, @samp{?}, @samp{[}, @samp{]} and
@samp{\}. Any other character appearing in a regular expression is
ordinary, unless a @samp{\} precedes it. (When you use regular
expressions in a Lisp program, each @samp{\} must be doubled; see the
example near the end of this section.)

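To make the doubling concrete, here is a small sketch using
@code{string-match}, which returns the index where a match starts.
It relies on @samp{\} quoting a special character so that it matches
literally; the sample string @samp{a.b} is made up for illustration.

@example
;; The regexp \. (written "\\." in Lisp) matches a literal period.
(string-match "\\." "a.b")
     @result{} 1

;; An unquoted period matches any character, here the leading "a".
(string-match "." "a.b")
     @result{} 0
@end example
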
For example, @samp{f} is not a special character, so it is ordinary, and
therefore @samp{f} is a regular expression that matches the string
@samp{f} and no other string. (It does @emph{not} match the string
@samp{ff}.) Likewise, @samp{o} is a regular expression that matches
only @samp{o}. (When case distinctions are being ignored, these regexps
also match @samp{F} and @samp{O}, but we consider this a generalization
of ``the same string,'' rather than an exception.)

Any two regular expressions @var{a} and @var{b} can be concatenated. The
result is a regular expression which matches a string if @var{a} matches
some amount of the beginning of that string and @var{b} matches the rest of
the string.@refill

As a simple example, we can concatenate the regular expressions @samp{f}
and @samp{o} to get the regular expression @samp{fo}, which matches only
the string @samp{fo}. Still trivial. To do something nontrivial, you
need to use one of the special characters. Here is a list of them.

@table @asis
@item @kbd{.}@: @r{(Period)}
is a special character that matches any single character except a newline.
Using concatenation, we can make regular expressions like @samp{a.b}, which
matches any three-character string that begins with @samp{a} and ends with
@samp{b}.@refill

@item @kbd{*}
is not a construct by itself; it is a postfix operator that means to
match the preceding regular expression repetitively as many times as
possible. Thus, @samp{o*} matches any number of @samp{o}s (including no
@samp{o}s).

@samp{*} always applies to the @emph{smallest} possible preceding
expression. Thus, @samp{fo*} has a repeating @samp{o}, not a repeating
@samp{fo}. It matches @samp{f}, @samp{fo}, @samp{foo}, and so on.

The matcher processes a @samp{*} construct by matching, immediately,
as many repetitions as can be found. Then it continues with the rest
of the pattern. If that fails, backtracking occurs, discarding some
of the matches of the @samp{*}-modified construct in case that makes
it possible to match the rest of the pattern. For example, in matching
@samp{ca*ar} against the string @samp{caaar}, the @samp{a*} first
tries to match all three @samp{a}s; but the rest of the pattern is
@samp{ar} and there is only @samp{r} left to match, so this try fails.
The next alternative is for @samp{a*} to match only two @samp{a}s.
With this choice, the rest of the regexp matches successfully.@refill

@item @kbd{+}
is a postfix operator, similar to @samp{*} except that it must match
the preceding expression at least once. So, for example, @samp{ca+r}
matches the strings @samp{car} and @samp{caaaar} but not the string
@samp{cr}, whereas @samp{ca*r} matches all three strings.

@item @kbd{?}
is a postfix operator, similar to @samp{*} except that it can match the
preceding expression either once or not at all. For example,
@samp{ca?r} matches @samp{car} or @samp{cr}; nothing else.

@item @kbd{*?}, @kbd{+?}, @kbd{??}
@cindex non-greedy regexp matching
are non-greedy variants of the operators above. The normal operators
@samp{*}, @samp{+}, @samp{?} are @dfn{greedy} in that they match as
much as they can, as long as the overall regexp can still match. With
a following @samp{?}, they are non-greedy: they will match as little
as possible.

Thus, both @samp{ab*} and @samp{ab*?} can match the string @samp{a}
and the string @samp{abbbb}; but if you try to match them both against
the text @samp{abbb}, @samp{ab*} will match it all (the longest valid
match), while @samp{ab*?} will match just @samp{a} (the shortest
valid match).

Non-greedy operators match the shortest possible string starting at a
given starting point; in a forward search, though, the earliest
possible starting point for a match is always the one chosen. Thus, if
you search for @samp{a.*?$} against the text @samp{abbab} followed by
a newline, it matches the whole string. Since it @emph{can} match
starting at the first @samp{a}, it does.

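The difference can be checked from Lisp. The following sketch uses
@code{string-match} on the sample texts from the paragraphs above;
@code{match-end} reports where the match stopped and
@code{match-string} returns the matched text.

@example
;; Greedy: the match covers all of "abbb".
(string-match "ab*" "abbb")
(match-end 0)
     @result{} 4

;; Non-greedy: the match is just "a".
(string-match "ab*?" "abbb")
(match-end 0)
     @result{} 1

;; The earliest starting point wins, so the whole line is matched.
(string-match "a.*?$" "abbab\n")
(match-string 0 "abbab\n")
     @result{} "abbab"
@end example
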
@item @kbd{\@{@var{n}\@}}
is a postfix operator that specifies repetition @var{n} times---that
is, the preceding regular expression must match exactly @var{n} times
in a row. For example, @samp{x\@{4\@}} matches the string @samp{xxxx}
and nothing else.

@item @kbd{\@{@var{n},@var{m}\@}}
is a postfix operator that specifies repetition between @var{n} and
@var{m} times---that is, the preceding regular expression must match
at least @var{n} times, but no more than @var{m} times. If @var{m} is
omitted, then there is no upper limit, but the preceding regular
expression must match at least @var{n} times.@* @samp{\@{0,1\@}} is
equivalent to @samp{?}. @* @samp{\@{0,\@}} is equivalent to
@samp{*}. @* @samp{\@{1,\@}} is equivalent to @samp{+}.

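As a sketch, the interval operators can be tried from Lisp with
@code{string-match}. The sample strings are made up for
illustration, and each @samp{\} is doubled because the regexp is
written as a Lisp string.

@example
;; Exactly four repetitions are required.
(string-match "x\\@{4\\@}" "xxxx")
     @result{} 0
(string-match "x\\@{4\\@}" "xxx")
     @result{} nil

;; Between two and three repetitions: the match stops after three.
(string-match "x\\@{2,3\\@}" "xxxxx")
(match-end 0)
     @result{} 3
@end example
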
@item @kbd{[ @dots{} ]}
is a @dfn{character set}, which begins with @samp{[} and is terminated
by @samp{]}. In the simplest case, the characters between the two
brackets are what this set can match.

Thus, @samp{[ad]} matches either one @samp{a} or one @samp{d}, and
@samp{[ad]*} matches any string composed of just @samp{a}s and @samp{d}s
(including the empty string), from which it follows that @samp{c[ad]*r}
matches @samp{cr}, @samp{car}, @samp{cdr}, @samp{caddaar}, etc.

You can also include character ranges in a character set, by writing the
starting and ending characters with a @samp{-} between them. Thus,
@samp{[a-z]} matches any lower-case @acronym{ASCII} letter. Ranges may be
intermixed freely with individual characters, as in @samp{[a-z$%.]},
which matches any lower-case @acronym{ASCII} letter or @samp{$}, @samp{%} or
period.

Note that the usual regexp special characters are not special inside a
character set. A completely different set of special characters exists
inside character sets: @samp{]}, @samp{-} and @samp{^}.

To include a @samp{]} in a character set, you must make it the first
character. For example, @samp{[]a]} matches @samp{]} or @samp{a}. To
include a @samp{-}, write @samp{-} as the first or last character of the
set, or put it after a range. Thus, @samp{[]-]} matches both @samp{]}
and @samp{-}.

To include @samp{^} in a set, put it anywhere but at the beginning of
the set. (At the beginning, it complements the set---see below.)

When you use a range in case-insensitive search, you should write both
ends of the range in upper case, or both in lower case, or both should
be non-letters. The behavior of a mixed-case range such as @samp{A-z}
is somewhat ill-defined, and it may change in future Emacs versions.

f3143102 | 560 | @item @kbd{[^ @dots{} ]} |
6bf7aab6 DL |
561 | @samp{[^} begins a @dfn{complemented character set}, which matches any |
562 | character except the ones specified. Thus, @samp{[^a-z0-9A-Z]} matches | |
76dd3692 | 563 | all characters @emph{except} @acronym{ASCII} letters and digits. |
6bf7aab6 DL |
564 | |
565 | @samp{^} is not special in a character set unless it is the first | |
566 | character. The character following the @samp{^} is treated as if it | |
567 | were first (in other words, @samp{-} and @samp{]} are not special there). | |
568 | ||
569 | A complemented character set can match a newline, unless newline is | |
570 | mentioned as one of the characters not to match. This is in contrast to | |
571 | the handling of regexps in programs such as @code{grep}. | |
572 | ||
f3143102 | 573 | @item @kbd{^} |
6bf7aab6 DL |
574 | is a special character that matches the empty string, but only at the |
575 | beginning of a line in the text being matched. Otherwise it fails to | |
576 | match anything. Thus, @samp{^foo} matches a @samp{foo} that occurs at | |
577 | the beginning of a line. | |
578 | ||
f3143102 | 579 | @item @kbd{$} |
6bf7aab6 DL |
580 | is similar to @samp{^} but matches only at the end of a line. Thus, |
581 | @samp{x+$} matches a string of one @samp{x} or more at the end of a line. | |
582 | ||
f3143102 | 583 | @item @kbd{\} |
6bf7aab6 DL |
584 | has two functions: it quotes the special characters (including |
585 | @samp{\}), and it introduces additional special constructs. | |
586 | ||
587 | Because @samp{\} quotes special characters, @samp{\$} is a regular | |
588 | expression that matches only @samp{$}, and @samp{\[} is a regular | |
589 | expression that matches only @samp{[}, and so on. | |
590 | @end table | |
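
The constructs above can be tried out from Lisp with @code{string-match},
which returns the index where the first match starts, or @code{nil} if
there is none.  The following is a small illustrative sketch, not an
exhaustive test; the sample strings are made up for this example, and
each backslash of a regexp is doubled because the regexps are written
as Lisp strings.

@example
;; Postfix repetition counts.
(string-match "x\\@{4\\@}" "xxxx")           @result{} 0
(string-match "ax\\@{2,3\\@}b" "axxxxb")     @result{} nil

;; Character sets and ranges.
(string-match "c[ad]*r" "caddaar")         @result{} 0
(string-match "[]a]" "x]")                 @result{} 1

;; A complemented set can match a newline.
(string-match "[^a-z]" "ab\ncd")           @result{} 2

;; Line anchors.
(string-match "^foo" "bar\nfoo")           @result{} 4
(string-match "x+$" "axx\nb")              @result{} 1
@end example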
591 | ||
592 | Note: for historical compatibility, special characters are treated as | |
593 | ordinary ones if they are in contexts where their special meanings make no | |
594 | sense. For example, @samp{*foo} treats @samp{*} as ordinary since there is | |
595 | no preceding expression on which the @samp{*} can act. It is poor practice | |
596 | to depend on this behavior; it is better to quote the special character anyway, | |
597 | regardless of where it appears.@refill | |
598 | ||
599 | For the most part, @samp{\} followed by any character matches only that | |
600 | character. However, there are several exceptions: two-character | |
601 | sequences starting with @samp{\} that have special meanings. The second | |
602 | character in the sequence is always an ordinary character when used on | |
603 | its own. Here is a table of @samp{\} constructs. | |
604 | ||
605 | @table @kbd | |
606 | @item \| | |
607 | specifies an alternative. Two regular expressions @var{a} and @var{b} | |
608 | with @samp{\|} in between form an expression that matches some text if | |
609 | either @var{a} matches it or @var{b} matches it. It works by trying to | |
610 | match @var{a}, and if that fails, by trying to match @var{b}. | |
611 | ||
612 | Thus, @samp{foo\|bar} matches either @samp{foo} or @samp{bar} | |
613 | but no other string.@refill | |
614 | ||
615 | @samp{\|} applies to the largest possible surrounding expressions. Only a | |
616 | surrounding @samp{\( @dots{} \)} grouping can limit the grouping power of | |
617 | @samp{\|}.@refill | |
618 | ||
619 | Full backtracking capability exists to handle multiple uses of @samp{\|}. | |
620 | ||
621 | @item \( @dots{} \) | |
622 | is a grouping construct that serves three purposes: | |
623 | ||
624 | @enumerate | |
625 | @item | |
626 | To enclose a set of @samp{\|} alternatives for other operations. | |
627 | Thus, @samp{\(foo\|bar\)x} matches either @samp{foox} or @samp{barx}. | |
628 | ||
629 | @item | |
630 | To enclose a complicated expression for the postfix operators @samp{*}, | |
631 | @samp{+} and @samp{?} to operate on. Thus, @samp{ba\(na\)*} matches | |
632 | @samp{bananana}, etc., with any (zero or more) number of @samp{na} | |
633 | strings.@refill | |
634 | ||
635 | @item | |
636 | To record a matched substring for future reference. | |
637 | @end enumerate | |
638 | ||
639 | This last application is not a consequence of the idea of a | |
640 | parenthetical grouping; it is a separate feature that is assigned as a | |
641 | second meaning to the same @samp{\( @dots{} \)} construct. In practice | |
91cf1909 RS |
642 | there is usually no conflict between the two meanings; when there is |
643 | a conflict, you can use a ``shy'' group. | |
95cd4c40 SM |
644 | |
645 | @item \(?: @dots{} \) | |
91cf1909 RS |
646 | @cindex shy group, in regexp |
647 | specifies a ``shy'' group that does not record the matched substring; | |
648 | you can't refer back to it with @samp{\@var{d}}. This is useful | |
649 | in mechanically combining regular expressions, so that you | |
650 | can add groups for syntactic purposes without interfering with | |
651 | the numbering of the groups that were written by the user. | |
6bf7aab6 DL |
652 | |
653 | @item \@var{d} | |
654 | matches the same text that matched the @var{d}th occurrence of a | |
655 | @samp{\( @dots{} \)} construct. | |
656 | ||
657 | After the end of a @samp{\( @dots{} \)} construct, the matcher remembers | |
658 | the beginning and end of the text matched by that construct. Then, | |
659 | later on in the regular expression, you can use @samp{\} followed by the | |
660 | digit @var{d} to mean ``match the same text matched the @var{d}th time | |
661 | by the @samp{\( @dots{} \)} construct.'' | |
662 | ||
663 | The strings matching the first nine @samp{\( @dots{} \)} constructs | |
664 | appearing in a regular expression are assigned numbers 1 through 9 in | |
665 | the order that the open-parentheses appear in the regular expression. | |
666 | So you can use @samp{\1} through @samp{\9} to refer to the text matched | |
667 | by the corresponding @samp{\( @dots{} \)} constructs. | |
668 | ||
669 | For example, @samp{\(.*\)\1} matches any newline-free string that is | |
670 | composed of two identical halves. The @samp{\(.*\)} matches the first | |
671 | half, which may be anything, but the @samp{\1} that follows must match | |
672 | the same exact text. | |
673 | ||
674 | If a particular @samp{\( @dots{} \)} construct matches more than once | |
675 | (which can easily happen if it is followed by @samp{*}), only the last | |
676 | match is recorded. | |
677 | ||
678 | @item \` | |
4aa2d40e RS |
679 | matches the empty string, but only at the beginning of the string or |
680 | buffer (or its accessible portion) being matched against. | |
6bf7aab6 DL |
681 | |
682 | @item \' | |
4aa2d40e RS |
683 | matches the empty string, but only at the end of the string or buffer |
684 | (or its accessible portion) being matched against. | |
6bf7aab6 DL |
685 | |
686 | @item \= | |
687 | matches the empty string, but only at point. | |
688 | ||
689 | @item \b | |
690 | matches the empty string, but only at the beginning or | |
691 | end of a word. Thus, @samp{\bfoo\b} matches any occurrence of | |
692 | @samp{foo} as a separate word. @samp{\bballs?\b} matches | |
693 | @samp{ball} or @samp{balls} as a separate word.@refill | |
694 | ||
695 | @samp{\b} matches at the beginning or end of the buffer | |
696 | regardless of what text appears next to it. | |
697 | ||
698 | @item \B | |
699 | matches the empty string, but @emph{not} at the beginning or | |
700 | end of a word. | |
701 | ||
702 | @item \< | |
703 | matches the empty string, but only at the beginning of a word. | |
704 | @samp{\<} matches at the beginning of the buffer only if a | |
705 | word-constituent character follows. | |
706 | ||
707 | @item \> | |
708 | matches the empty string, but only at the end of a word. @samp{\>} | |
709 | matches at the end of the buffer only if the contents end with a | |
710 | word-constituent character. | |
711 | ||
712 | @item \w | |
713 | matches any word-constituent character. The syntax table | |
714 | determines which characters these are. @xref{Syntax}. | |
715 | ||
716 | @item \W | |
717 | matches any character that is not a word-constituent. | |
718 | ||
719 | @item \s@var{c} | |
720 | matches any character whose syntax is @var{c}. Here @var{c} is a | |
b20a1c88 RS |
721 | character that designates a particular syntax class: thus, @samp{w} |
722 | for word constituent, @samp{-} or @samp{ } for whitespace, @samp{.} | |
723 | for ordinary punctuation, etc. @xref{Syntax}. | |
6bf7aab6 DL |
724 | |
725 | @item \S@var{c} | |
726 | matches any character whose syntax is not @var{c}. | |
4a1b539b EZ |
727 | |
728 | @cindex categories of characters | |
729 | @cindex characters which belong to a specific language | |
730 | @findex describe-categories | |
731 | @item \c@var{c} | |
732 | matches any character that belongs to the category @var{c}. For | |
733 | example, @samp{\cc} matches Chinese characters, @samp{\cg} matches | |
734 | Greek characters, etc. For the description of the known categories, | |
735 | type @kbd{M-x describe-categories @key{RET}}. | |
736 | ||
737 | @item \C@var{c} | |
738 | matches any character that does @emph{not} belong to category | |
739 | @var{c}. | |
6bf7aab6 DL |
740 | @end table |
741 | ||
742 | The constructs that pertain to words and syntax are controlled by the | |
743 | setting of the syntax table (@pxref{Syntax}). | |
744 | ||
b20a1c88 RS |
745 | Here is a complicated regexp, stored in @code{sentence-end} and used |
746 | by Emacs to recognize the end of a sentence together with any | |
c00394a3 | 747 | whitespace that follows. We show its Lisp syntax to distinguish the |
b20a1c88 RS |
748 | spaces from the tab characters. In Lisp syntax, the string constant |
749 | begins and ends with a double-quote. @samp{\"} stands for a | |
750 | double-quote as part of the regexp, @samp{\\} for a backslash as part | |
751 | of the regexp, @samp{\t} for a tab, and @samp{\n} for a newline. | |
6bf7aab6 DL |
752 | |
753 | @example | |
b20a1c88 | 754 | "[.?!][]\"')]*\\($\\| $\\|\t\\|  \\)[ \t\n]*" |
6bf7aab6 DL |
755 | @end example |
756 | ||
757 | @noindent | |
b20a1c88 RS |
758 | This contains four parts in succession: a character set matching |
759 | period, @samp{?}, or @samp{!}; a character set matching | |
760 | close-brackets, quotes, or parentheses, repeated zero or more times; a | |
761 | set of alternatives within backslash-parentheses that matches either | |
762 | end-of-line, a space at the end of a line, a tab, or two spaces; and a | |
763 | character set matching whitespace characters, repeated any number of | |
764 | times. | |
6bf7aab6 | 765 | |
31b5be12 RS |
766 | To enter the same regexp in incremental search, you would type |
767 | @key{TAB} to enter a tab, and @kbd{C-j} to enter a newline. You would | |
768 | also type single backslashes as themselves, instead of doubling them | |
769 | for Lisp syntax. In commands that use ordinary minibuffer input to | |
770 | read a regexp, you would quote the @kbd{C-j} by preceding it with a | |
771 | @kbd{C-q} to prevent @kbd{C-j} from exiting the minibuffer. | |
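
From a Lisp program, the same regexp can be passed to a search function
such as @code{re-search-forward}.  The following sketch uses a temporary
buffer with made-up text, and writes the last alternative with two
spaces, as described above; it is meant only to show how the pieces fit
together.

@example
(with-temp-buffer
  (insert "First sentence.  Second one.")
  (goto-char (point-min))
  ;; Move point past the end of the first sentence and the
  ;; whitespace that follows it.
  (re-search-forward
   "[.?!][]\"')]*\\($\\| $\\|\t\\|  \\)[ \t\n]*" nil t)
  (point))
     @result{} 18
@end example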
6bf7aab6 | 772 | |
91cf1909 RS |
773 | @ignore |
774 | @c I commented this out because it is missing vital information | |
775 | @c and therefore useless. For instance, what do you do to *use* the | |
776 | @c regular expression when it is finished? What jobs is this good for? | |
777 | @c -- rms | |
778 | ||
3adfa3db EZ |
779 | @findex re-builder |
780 | @cindex authoring regular expressions | |
91cf1909 RS |
781 | For convenient interactive development of regular expressions, you |
782 | can use the @kbd{M-x re-builder} command. It provides a convenient | |
783 | interface for creating regular expressions, by giving immediate visual | |
784 | feedback. The buffer from which @code{re-builder} was invoked becomes | |
785 | the target for the regexp editor, which pops in a separate window. At | |
786 | all times, all the matches in the target buffer for the current | |
787 | regular expression are highlighted. Each parenthesized sub-expression | |
788 | of the regexp is shown in a distinct face, which makes it easier to | |
789 | verify even very complex regexps. (On displays that don't support | |
790 | colors, Emacs blinks the cursor around the matched text, as it does | |
791 | for matching parens.) | |
792 | @end ignore | |
3adfa3db | 793 | |
a57bfc9f | 794 | @node Search Case, Configuring Scrolling, Regexps, Search |
6bf7aab6 DL |
795 | @section Searching and Case |
796 | ||
6bf7aab6 DL |
797 | Incremental searches in Emacs normally ignore the case of the text |
798 | they are searching through, if you specify the text in lower case. | |
799 | Thus, if you specify searching for @samp{foo}, then @samp{Foo} and | |
800 | @samp{FOO} are also considered a match. Regexps, and in particular | |
801 | character sets, are included: @samp{[ab]} would match @samp{a} or | |
802 | @samp{A} or @samp{b} or @samp{B}.@refill | |
803 | ||
804 | An upper-case letter anywhere in the incremental search string makes | |
805 | the search case-sensitive. Thus, searching for @samp{Foo} does not find | |
806 | @samp{foo} or @samp{FOO}. This applies to regular expression search as | |
807 | well as to string search. The effect ceases if you delete the | |
808 | upper-case letter from the search string. | |
809 | ||
b20a1c88 RS |
810 | Typing @kbd{M-c} within an incremental search toggles the case |
811 | sensitivity of that search. The effect does not extend beyond the | |
812 | current incremental search to the next one, but it does override the | |
813 | effect of including an upper-case letter in the current search. | |
814 | ||
815 | @vindex case-fold-search | |
6bf7aab6 DL |
816 | If you set the variable @code{case-fold-search} to @code{nil}, then |
817 | all letters must match exactly, including case. This is a per-buffer | |
818 | variable; altering the variable affects only the current buffer, but | |
819 | there is a default value which you can change as well. @xref{Locals}. | |
820 | This variable applies to nonincremental searches also, including those | |
821 | performed by the replace commands (@pxref{Replace}) and the minibuffer | |
822 | history matching commands (@pxref{Minibuffer History}). | |
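
In Lisp, the effect of this variable can be seen with
@code{string-match}, which obeys @code{case-fold-search} but does not
apply the incremental-search rule about upper-case letters.  The
following lines are a sketch; where you put the @code{setq} forms (for
example, in your init file or in a mode hook) is up to you.

@example
(let ((case-fold-search t))
  (string-match "foo" "Foo"))        @result{} 0
(let ((case-fold-search nil))
  (string-match "foo" "Foo"))        @result{} nil

;; Make searches case-sensitive in the current buffer only.
(setq case-fold-search nil)

;; Change the default value, used by buffers that have no
;; local value of their own.
(setq-default case-fold-search nil)
@end example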
823 | ||
a57bfc9f EZ |
824 | @node Configuring Scrolling, Replace, Search Case, Search |
825 | @section Configuring Scrolling | |
826 | @cindex scrolling in incremental search | |
827 | @vindex isearch-allow-scroll | |
828 | ||
829 | To use scrolling and related window commands during an incremental search without exiting it, set the | |
830 | customizable variable @code{isearch-allow-scroll} to a non-@code{nil} value. | |
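
For instance, assuming you set options from your init file rather than
through Customize, a line such as this one would do it:

@example
(setq isearch-allow-scroll t)
@end example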
831 | ||
832 | @c See Subject: Info file: How do I get an itemized list without blank lines? | |
833 | @c Date: Sat, 12 Apr 2003 09:45:31 +0000 in gnu.emacs.help | |
834 | @subsection Standard scrolling commands | |
835 | Here is the list of commands which are configured by default to be | |
836 | ``scrolling'' commands in an incremental search, together with their | |
837 | usual bindings: | |
838 | @subsubsection Commands which scroll the window: | |
839 | @table @asis | |
840 | @item @code{scroll-bar-toolkit-scroll} (@kbd{@key{vertical-scroll-bar}@key{mouse-1}} in X-Windows) | |
841 | @itemx @code{mac-handle-scroll-bar-event} (@kbd{@key{vertical-scroll-bar}@key{mouse-1}} on a Mac) | |
842 | @itemx @code{w32-handle-scroll-bar-event} (@kbd{@key{vertical-scroll-bar}@key{mouse-1}} in MS-Windows) | |
843 | @item @code{recenter} (@kbd{C-l}) @xref{Scrolling}. | |
844 | @itemx @code{reposition-window} (@kbd{C-M-l}) @xref{Scrolling}. | |
845 | @itemx @code{scroll-up} (@kbd{@key{NEXT}}) @xref{Scrolling}. | |
846 | @itemx @code{scroll-down} (@kbd{@key{PRIOR}}) @xref{Scrolling}. | |
847 | @end table | |
848 | ||
849 | @subsubsection Commands which act on the other window: | |
850 | @table @asis | |
851 | @item @code{list-buffers} (@kbd{C-x C-b}) @xref{List Buffers}. | |
852 | @itemx @code{scroll-other-window} (@kbd{C-M-v}) @xref{Other Window}. | |
853 | @itemx @code{scroll-other-window-down} (@kbd{C-M-S-v}) @xref{Other Window}. | |
854 | @itemx @code{beginning-of-buffer-other-window} (@kbd{M-@key{home}}) | |
855 | @itemx @code{end-of-buffer-other-window} (@kbd{M-@key{end}}) | |
856 | @end table | |
857 | ||
858 | @subsubsection Commands which change the window layout: | |
859 | @table @asis | |
860 | @item @code{delete-other-windows} (@kbd{C-x 1}) @xref{Change Window}. | |
861 | @itemx @code{balance-windows} (@kbd{C-x +}) @xref{Change Window}. | |
862 | @itemx @code{split-window-vertically} (@kbd{C-x 2}) @xref{Split Window}. | |
863 | @itemx @code{enlarge-window} (@kbd{C-x ^}) @xref{Change Window}. | |
864 | @end table | |
865 | ||
866 | @subsection Configuring other commands as scrolling commands | |
867 | To make another command usable during an incremental search in the same way, set its | |
868 | @code{isearch-scroll} property to the value @code{t}. For example: | |
869 | ||
870 | @example | |
871 | (put 'my-command 'isearch-scroll t) | |
872 | @end example | |
873 | ||
874 | You should configure in this way only commands that are ``safe'', i.e., | |
875 | commands that won't leave Emacs in an inconsistent state when executed within a | |
876 | search. Specifically, a command may change the following things | |
877 | only temporarily, and must restore them before it | |
878 | finishes: | |
879 | ||
880 | @enumerate | |
881 | @item | |
882 | Point. | |
883 | @item | |
884 | The buffer contents. | |
885 | @item | |
886 | The selected window and selected frame. | |
887 | @item | |
888 | The current match data (@pxref{Match Data,,,elisp}). | |
889 | @end enumerate | |
890 | ||
891 | Additionally, the command must not delete the current window and must | |
892 | not itself attempt an incremental search. It may, however, change the | |
893 | window's size, or create or delete other windows and frames. | |
894 | ||
895 | Note that an attempt by a command to scroll the text | |
896 | @emph{horizontally} won't work, although it does no harm: any such | |
897 | scrolling is overridden and nullified by the display code. | |
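
As an illustration, here is a hypothetical command (the name
@code{my-scroll-up-line} is invented for this example) that only
scrolls the selected window by one line, together with the property
that marks it as a scrolling command.  It is a sketch of the pattern,
not a command that Emacs defines.

@example
(defun my-scroll-up-line ()
  "Scroll the text of the selected window up by one line."
  (interactive)
  (scroll-up 1))

;; Allow the command to be used without exiting incremental search.
(put 'my-scroll-up-line 'isearch-scroll t)
@end example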
898 | ||
899 | @node Replace, Other Repeating Search, Configuring Scrolling, Search | |
6bf7aab6 DL |
900 | @section Replacement Commands |
901 | @cindex replacement | |
902 | @cindex search-and-replace commands | |
903 | @cindex string substitution | |
904 | @cindex global substitution | |
905 | ||
a76af65d RS |
906 | Global search-and-replace operations are not needed often in Emacs, |
907 | but they are available. In addition to the simple @kbd{M-x | |
908 | replace-string} command which is like that found in most editors, | |
909 | there is a @kbd{M-x query-replace} command which finds each occurrence | |
910 | of the pattern and asks you whether to replace it. | |
6bf7aab6 DL |
911 | |
912 | The replace commands normally operate on the text from point to the | |
293fa54a EZ |
913 | end of the buffer; however, in Transient Mark mode (@pxref{Transient |
914 | Mark}), when the mark is active, they operate on the region. The | |
915 | replace commands all replace one string (or regexp) with one | |
916 | replacement string. It is possible to perform several replacements in | |
917 | parallel using the command @code{expand-region-abbrevs} | |
918 | (@pxref{Expanding Abbrevs}). | |
6bf7aab6 DL |
919 | |
920 | @menu | |
a57bfc9f EZ |
921 | * Unconditional Replace:: Replacing all matches for a string. |
922 | * Regexp Replace:: Replacing all matches for a regexp. | |
923 | * Replacement and Case:: How replacements preserve case of letters. | |
924 | * Query Replace:: How to use querying. | |
6bf7aab6 DL |
925 | @end menu |
926 | ||
927 | @node Unconditional Replace, Regexp Replace, Replace, Replace | |
928 | @subsection Unconditional Replacement | |
929 | @findex replace-string | |
930 | @findex replace-regexp | |
931 | ||
932 | @table @kbd | |
933 | @item M-x replace-string @key{RET} @var{string} @key{RET} @var{newstring} @key{RET} | |
934 | Replace every occurrence of @var{string} with @var{newstring}. | |
935 | @item M-x replace-regexp @key{RET} @var{regexp} @key{RET} @var{newstring} @key{RET} | |
936 | Replace every match for @var{regexp} with @var{newstring}. | |
937 | @end table | |
938 | ||
939 | To replace every instance of @samp{foo} after point with @samp{bar}, | |
940 | use the command @kbd{M-x replace-string} with the two arguments | |
941 | @samp{foo} and @samp{bar}. Replacement happens only in the text after | |
942 | point, so if you want to cover the whole buffer you must go to the | |
943 | beginning first. All occurrences up to the end of the buffer are | |
944 | replaced; to limit replacement to part of the buffer, narrow to that | |
945 | part of the buffer before doing the replacement (@pxref{Narrowing}). | |
946 | In Transient Mark mode, when the region is active, replacement is | |
947 | limited to the region (@pxref{Transient Mark}). | |
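
In a Lisp program, the same kind of replacement is usually written as
an explicit search loop.  The following sketch works on made-up text in
a temporary buffer; moving to the beginning first corresponds to the
advice above about covering the whole buffer.

@example
(with-temp-buffer
  (insert "foo and foo")
  (goto-char (point-min))
  (while (search-forward "foo" nil t)
    ;; Insert "bar" literally, with no case conversion.
    (replace-match "bar" t t))
  (buffer-string))
     @result{} "bar and bar"
@end example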
948 | ||
949 | When @code{replace-string} exits, it leaves point at the last | |
950 | occurrence replaced. It sets the mark to the prior position of point | |
951 | (where the @code{replace-string} command was issued); use @kbd{C-u | |
952 | C-@key{SPC}} to move back there. | |
953 | ||
954 | A numeric argument restricts replacement to matches that are surrounded | |
955 | by word boundaries. The argument's value doesn't matter. | |
956 | ||
46b1e9bb RS |
957 | What if you want to exchange @samp{x} and @samp{y}: replace every @samp{x} with a @samp{y} and vice versa? You can do it this way: |
958 | ||
959 | @example | |
960 | M-x query-replace @key{RET} x @key{RET} @@TEMP@@ @key{RET} | |
961 | M-x query-replace @key{RET} y @key{RET} x @key{RET} | |
962 | M-x query-replace @key{RET} @@TEMP@@ @key{RET} y @key{RET} | |
963 | @end example | |
964 | ||
965 | @noindent | |
966 | This works provided the string @samp{@@TEMP@@} does not appear | |
967 | in your text. | |
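
If you prefer to avoid the temporary string, the exchange can also be
done in a single pass from Lisp, by searching for either letter and
choosing the replacement according to what was matched.  This is a
sketch of one alternative, with sample text invented for the example:

@example
(with-temp-buffer
  (insert "x + y = y + x")
  (goto-char (point-min))
  (let ((case-fold-search nil))
    (while (re-search-forward "[xy]" nil t)
      ;; Replace literally; point ends up after the new text,
      ;; so a replaced letter is never matched again.
      (replace-match (if (string= (match-string 0) "x") "y" "x")
                     t t)))
  (buffer-string))
     @result{} "y + x = x + y"
@end example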
968 | ||
6bf7aab6 DL |
969 | @node Regexp Replace, Replacement and Case, Unconditional Replace, Replace |
970 | @subsection Regexp Replacement | |
971 | ||
972 | The @kbd{M-x replace-string} command replaces exact matches for a | |
973 | single string. The similar command @kbd{M-x replace-regexp} replaces | |
974 | any match for a specified pattern. | |
975 | ||
976 | In @code{replace-regexp}, the @var{newstring} need not be constant: it | |
977 | can refer to all or part of what is matched by the @var{regexp}. | |
978 | @samp{\&} in @var{newstring} stands for the entire match being replaced. | |
979 | @samp{\@var{d}} in @var{newstring}, where @var{d} is a digit, stands for | |
980 | whatever matched the @var{d}th parenthesized grouping in @var{regexp}. | |
981 | To include a @samp{\} in the text to replace with, you must enter | |
982 | @samp{\\}. For example, | |
983 | ||
984 | @example | |
985 | M-x replace-regexp @key{RET} c[ad]+r @key{RET} \&-safe @key{RET} | |
986 | @end example | |
987 | ||
988 | @noindent | |
989 | replaces (for example) @samp{cadr} with @samp{cadr-safe} and @samp{cddr} | |
990 | with @samp{cddr-safe}. | |
991 | ||
992 | @example | |
993 | M-x replace-regexp @key{RET} \(c[ad]+r\)-safe @key{RET} \1 @key{RET} | |
994 | @end example | |
995 | ||
996 | @noindent | |
997 | performs the inverse transformation. | |
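
The same @samp{\&} and @samp{\@var{d}} notations are understood by the
Lisp function @code{replace-regexp-in-string}, which operates on a
string instead of a buffer.  As a sketch, with sample strings made up
for the example and backslashes doubled for Lisp syntax:

@example
(replace-regexp-in-string "c[ad]+r" "\\&-safe"
                          "(cadr (cddr x))")
     @result{} "(cadr-safe (cddr-safe x))"

(replace-regexp-in-string "\\(c[ad]+r\\)-safe" "\\1"
                          "(cadr-safe x)")
     @result{} "(cadr x)"
@end example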
998 | ||
999 | @node Replacement and Case, Query Replace, Regexp Replace, Replace | |
1000 | @subsection Replace Commands and Case | |
1001 | ||
1002 | If the first argument of a replace command is all lower case, the | |
e7ad2d23 | 1003 | command ignores case while searching for occurrences to |
6bf7aab6 DL |
1004 | replace---provided @code{case-fold-search} is non-@code{nil}. If |
1005 | @code{case-fold-search} is set to @code{nil}, case is always significant | |
1006 | in all searches. | |
1007 | ||
1008 | @vindex case-replace | |
1009 | In addition, when the @var{newstring} argument is all or partly lower | |
1010 | case, replacement commands try to preserve the case pattern of each | |
1011 | occurrence. Thus, the command | |
1012 | ||
1013 | @example | |
1014 | M-x replace-string @key{RET} foo @key{RET} bar @key{RET} | |
1015 | @end example | |
1016 | ||
1017 | @noindent | |
1018 | replaces a lower case @samp{foo} with a lower case @samp{bar}, an | |
1019 | all-caps @samp{FOO} with @samp{BAR}, and a capitalized @samp{Foo} with | |
1020 | @samp{Bar}. (Lower case, all caps, and capitalized are the only | |
1021 | three alternatives that @code{replace-string} can | |
1022 | distinguish.) | |
1023 | ||
1024 | If upper-case letters are used in the replacement string, they remain | |
1025 | upper case every time that text is inserted. If upper-case letters are | |
1026 | used in the first argument, the second argument is always substituted | |
1027 | exactly as given, with no case conversion. Likewise, if either | |
1028 | @code{case-replace} or @code{case-fold-search} is set to @code{nil}, | |
1029 | replacement is done without case conversion. | |
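
The case conversion can be observed from Lisp with
@code{replace-match}, whose optional second argument, when
non-@code{nil}, suppresses the conversion.  This sketch uses made-up
text in a temporary buffer; note that @code{replace-match} itself does
not look at @code{case-replace}.

@example
(with-temp-buffer
  (insert "foo Foo FOO")
  (goto-char (point-min))
  (let ((case-fold-search t))
    (while (search-forward "foo" nil t)
      ;; No second argument: adapt the case of "bar" to the match.
      (replace-match "bar")))
  (buffer-string))
     @result{} "bar Bar BAR"
@end example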
1030 | ||
1031 | @node Query Replace,, Replacement and Case, Replace | |
1032 | @subsection Query Replace | |
1033 | @cindex query replace | |
1034 | ||
1035 | @table @kbd | |
1036 | @item M-% @var{string} @key{RET} @var{newstring} @key{RET} | |
1037 | @itemx M-x query-replace @key{RET} @var{string} @key{RET} @var{newstring} @key{RET} | |
1038 | Replace some occurrences of @var{string} with @var{newstring}. | |
1039 | @item C-M-% @var{regexp} @key{RET} @var{newstring} @key{RET} | |
1040 | @itemx M-x query-replace-regexp @key{RET} @var{regexp} @key{RET} @var{newstring} @key{RET} | |
1041 | Replace some matches for @var{regexp} with @var{newstring}. | |
1042 | @end table | |
1043 | ||
1044 | @kindex M-% | |
1045 | @findex query-replace | |
1046 | If you want to change only some of the occurrences of @samp{foo} to | |
1047 | @samp{bar}, not all of them, then you cannot use an ordinary | |
1048 | @code{replace-string}. Instead, use @kbd{M-%} (@code{query-replace}). | |
1049 | This command finds occurrences of @samp{foo} one by one, displays each | |
b20a1c88 RS |
1050 | occurrence and asks you whether to replace it. Aside from querying, |
1051 | @code{query-replace} works just like @code{replace-string}. It | |
1052 | preserves case, like @code{replace-string}, provided | |
1053 | @code{case-replace} is non-@code{nil}, as it normally is. A numeric | |
1054 | argument means consider only occurrences that are bounded by | |
1055 | word-delimiter characters. | |
6bf7aab6 DL |
1056 | |
1057 | @kindex C-M-% | |
1058 | @findex query-replace-regexp | |
b20a1c88 | 1059 | @kbd{C-M-%} performs regexp search and replace (@code{query-replace-regexp}). |
6bf7aab6 | 1060 | |
b20a1c88 RS |
1061 | The characters you can type when you are shown a match for the string |
1062 | or regexp are: | |
6bf7aab6 DL |
1063 | |
1064 | @ignore @c Not worth it. | |
1065 | @kindex SPC @r{(query-replace)} | |
1066 | @kindex DEL @r{(query-replace)} | |
1067 | @kindex , @r{(query-replace)} | |
1068 | @kindex RET @r{(query-replace)} | |
1069 | @kindex . @r{(query-replace)} | |
1070 | @kindex ! @r{(query-replace)} | |
1071 | @kindex ^ @r{(query-replace)} | |
1072 | @kindex C-r @r{(query-replace)} | |
1073 | @kindex C-w @r{(query-replace)} | |
1074 | @kindex C-l @r{(query-replace)} | |
1075 | @end ignore | |
1076 | ||
1077 | @c WideCommands | |
1078 | @table @kbd | |
1079 | @item @key{SPC} | |
1080 | to replace the occurrence with @var{newstring}. | |
1081 | ||
1082 | @item @key{DEL} | |
1083 | to skip to the next occurrence without replacing this one. | |
1084 | ||
1085 | @item , @r{(Comma)} | |
1086 | to replace this occurrence and display the result. You are then asked | |
1087 | for another input character to say what to do next. Since the | |
1088 | replacement has already been made, @key{DEL} and @key{SPC} are | |
1089 | equivalent in this situation; both move to the next occurrence. | |
1090 | ||
1091 | You can type @kbd{C-r} at this point (see below) to alter the replaced | |
1092 | text. You can also type @kbd{C-x u} to undo the replacement; this exits | |
1093 | the @code{query-replace}, so if you want to do further replacement you | |
1094 | must use @kbd{C-x @key{ESC} @key{ESC} @key{RET}} to restart | |
1095 | (@pxref{Repetition}). | |
1096 | ||
1097 | @item @key{RET} | |
1098 | to exit without doing any more replacements. | |
1099 | ||
1100 | @item .@: @r{(Period)} | |
1101 | to replace this occurrence and then exit without searching for more | |
1102 | occurrences. | |
1103 | ||
1104 | @item ! | |
1105 | to replace all remaining occurrences without asking again. | |
1106 | ||
1107 | @item ^ | |
1108 | to go back to the position of the previous occurrence (or what used to | |
1109 | be an occurrence), in case you changed it by mistake. This works by | |
1110 | popping the mark ring. Only one @kbd{^} in a row is meaningful, because | |
1111 | only one previous replacement position is kept during @code{query-replace}. | |
1112 | ||
1113 | @item C-r | |
1114 | to enter a recursive editing level, in case the occurrence needs to be | |
1115 | edited rather than just replaced with @var{newstring}. When you are | |
1116 | done, exit the recursive editing level with @kbd{C-M-c} to proceed to | |
1117 | the next occurrence. @xref{Recursive Edit}. | |
1118 | ||
1119 | @item C-w | |
1120 | to delete the occurrence, and then enter a recursive editing level as in | |
1121 | @kbd{C-r}. Use the recursive edit to insert text to replace the deleted | |
1122 | occurrence of @var{string}. When done, exit the recursive editing level | |
1123 | with @kbd{C-M-c} to proceed to the next occurrence. | |
1124 | ||
91cf1909 RS |
1125 | @item e |
1126 | to edit the replacement string in the minibuffer. When you exit the | |
1127 | minibuffer by typing @key{RET}, the minibuffer contents replace the | |
1128 | current occurrence of the pattern. They also become the new | |
1129 | replacement string for any further occurrences. | |
1130 | ||
6bf7aab6 DL |
1131 | @item C-l |
1132 | to redisplay the screen. Then you must type another character to | |
1133 | specify what to do with this occurrence. | |
1134 | ||
1135 | @item C-h | |
1136 | to display a message summarizing these options. Then you must type | |
1137 | another character to specify what to do with this occurrence. | |
1138 | @end table | |
1139 | ||
1140 | Some other characters are aliases for the ones listed above: @kbd{y}, | |
1141 | @kbd{n} and @kbd{q} are equivalent to @key{SPC}, @key{DEL} and | |
1142 | @key{RET}. | |
1143 | ||
1144 | Aside from this, any other character exits the @code{query-replace}, | |
1145 | and is then reread as part of a key sequence. Thus, if you type | |
1146 | @kbd{C-k}, it exits the @code{query-replace} and then kills to end of | |
1147 | line. | |
1148 | ||
1149 | To restart a @code{query-replace} once it is exited, use @kbd{C-x | |
1150 | @key{ESC} @key{ESC}}, which repeats the @code{query-replace} because it | |
1151 | used the minibuffer to read its arguments. @xref{Repetition, C-x ESC | |
1152 | ESC}. | |
1153 | ||
1154 | See also @ref{Transforming File Names}, for Dired commands to rename, | |
1155 | copy, or link files by replacing regexp matches in file names. | |
1156 | ||
1157 | @node Other Repeating Search,, Replace, Search | |
1158 | @section Other Search-and-Loop Commands | |
1159 | ||
1160 | Here are some other commands that find matches for a regular | |
91cf1909 RS |
1161 | expression. They all ignore case in matching if the pattern contains | |
1162 | no upper-case letters and @code{case-fold-search} is non-@code{nil}. | |
8cab3b9a CW |
1163 | Aside from @code{occur} and its variants, all operate on the text from |
1164 | point to the end of the buffer, or on the active region in Transient | |
1165 | Mark mode. | |
6bf7aab6 DL |
1166 | |
1167 | @findex list-matching-lines | |
1168 | @findex occur | |
8cab3b9a CW |
1169 | @findex multi-occur |
1170 | @findex multi-occur-by-filename-regexp | |
9c99d206 | 1171 | @findex how-many |
6bf7aab6 DL |
1172 | @findex delete-non-matching-lines |
1173 | @findex delete-matching-lines | |
1174 | @findex flush-lines | |
1175 | @findex keep-lines | |
1176 | ||
1177 | @table @kbd | |
1178 | @item M-x occur @key{RET} @var{regexp} @key{RET} | |
91cf1909 RS |
1179 | Display a list showing each line in the buffer that contains a match |
1180 | for @var{regexp}. To limit the search to part of the buffer, narrow | |
1181 | to that part (@pxref{Narrowing}). A numeric argument @var{n} | |
f8635375 EZ |
1182 | specifies that @var{n} lines of context are to be displayed before and |
1183 | after each matching line. | |
6bf7aab6 DL |
1184 | |
1185 | @kindex RET @r{(Occur mode)} | |
74308f28 RS |
1186 | @kindex o @r{(Occur mode)} |
1187 | @kindex C-o @r{(Occur mode)} | |
6bf7aab6 | 1188 | The buffer @samp{*Occur*} containing the output serves as a menu for |
74308f28 RS |
1189 | finding the occurrences in their original context. Click |
1190 | @kbd{Mouse-2} on an occurrence listed in @samp{*Occur*}, or position | |
1191 | point there and type @key{RET}; this switches to the buffer that was | |
1192 | searched and moves point to the original of the chosen occurrence. | |
1193 | @kbd{o} and @kbd{C-o} display the match in another window; @kbd{C-o} | |
1194 | does not select it. | |
6bf7aab6 DL |
1195 | |
1196 | @item M-x list-matching-lines | |
1197 | Synonym for @kbd{M-x occur}. | |
1198 | ||
8cab3b9a | 1199 | @item M-x multi-occur @key{RET} @var{buffers} @key{RET} @var{regexp} @key{RET} |
db639d24 | 1200 | This command is just like @code{occur}, except that it searches |
8cab3b9a CW |
1201 | through multiple buffers. | |
1202 | ||
1203 | @item M-x multi-occur-by-filename-regexp @key{RET} @var{bufregexp} @key{RET} @var{regexp} @key{RET} | |
db639d24 | 1204 | This command is similar to @code{multi-occur}, except that the buffers to |
8cab3b9a CW |
1205 | search are specified by a regexp that matches their file names. | |
1206 | ||
9c99d206 | 1207 | @item M-x how-many @key{RET} @var{regexp} @key{RET} |
91cf1909 RS |
1208 | Print the number of matches for @var{regexp} that exist in the buffer |
1209 | after point. In Transient Mark mode, if the region is active, the | |
1210 | command operates on the region instead. | |
6bf7aab6 DL |
1211 | |
1212 | @item M-x flush-lines @key{RET} @var{regexp} @key{RET} | |
91cf1909 RS |
1213 | Delete each line that contains a match for @var{regexp}, operating on |
1214 | the text after point. In Transient Mark mode, if the region is | |
1215 | active, the command operates on the region instead. | |
6bf7aab6 DL |
1216 | |
1217 | @item M-x keep-lines @key{RET} @var{regexp} @key{RET} | |
91cf1909 RS |
1218 | Delete each line that @emph{does not} contain a match for |
1219 | @var{regexp}, operating on the text after point. In Transient Mark | |
1220 | mode, if the region is active, the command operates on the region | |
1221 | instead. | |
6bf7aab6 DL |
1222 | @end table |
1223 | ||
91cf1909 RS |
1224 | You can also search multiple files under control of a tags table |
1225 | (@pxref{Tags Search}) or through the Dired @kbd{A} command | |
1226 | (@pxref{Operating on Files}), or ask the @code{grep} program to do it | |
1227 | (@pxref{Grep Searching}). | |
ab5796a9 MB |
1228 | |
1229 | @ignore | |
1230 | arch-tag: fd9d8e77-66af-491c-b212-d80999613e3e | |
1231 | @end ignore |