Commit | Line | Data |
---|---|---|
a933dad1 | 1 | Debugging GNU Emacs |
437368fe | 2 | Copyright (c) 1985, 2000, 2001 Free Software Foundation, Inc. |
a933dad1 DL |
3 | |
4 | Permission is granted to anyone to make or distribute verbatim copies | |
5 | of this document as received, in any medium, provided that the | |
6 | copyright notice and permission notice are preserved, | |
7 | and that the distributor grants the recipient permission | |
8 | for further redistribution as permitted by this notice. | |
9 | ||
10 | Permission is granted to distribute modified versions | |
11 | of this document, or of portions of it, | |
12 | under the above conditions, provided also that they | |
13 | carry prominent notices stating who last changed them. | |
14 | ||
437368fe EZ |
15 | [People who debug Emacs on Windows using native Windows debuggers |
16 | should read the Windows-specific section near the end of this | |
17 | document.] | |
18 | ||
19 | It is a good idea to run Emacs under GDB (or some other suitable | |
20 | debugger) *all the time*. Then, when Emacs crashes, you will be able | |
21 | to debug the live process, not just a core dump. (This is especially | |
22 | important on systems which don't support core files, and instead print | |
23 | just the registers and some stack addresses.) | |
24 | ||
25 | If Emacs hangs, or seems to be stuck in some infinite loop, typing | |
26 | "kill -TSTP PID", where PID is the Emacs process ID, will cause GDB to | |
27 | kick in, provided that you run under GDB. | |
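
A minimal sketch of the `kill -TSTP' step as a shell helper. The function name `stop_for_gdb' is made up for illustration, and the commented usage line assumes that `pgrep' is installed and that the newest process named exactly `emacs' is the one being debugged.

```shell
# Send SIGTSTP to a process that is running under GDB, so that GDB
# regains control.  The argument is the process ID of Emacs.
stop_for_gdb() {
    case $1 in
        # Reject an empty or non-numeric argument before calling kill.
        ''|*[!0-9]*) echo "usage: stop_for_gdb PID" >&2; return 2 ;;
    esac
    kill -TSTP "$1"
}

# Example (assumes pgrep is available; not run here):
#   stop_for_gdb "$(pgrep -n -x emacs)"
```

If several Emacs processes are running, look up the right PID by hand rather than relying on `pgrep'.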
28 | ||
29 | ** Getting control to the debugger | |
a933dad1 | 30 | |
3102e429 | 31 | `Fsignal' is a very useful place to put a breakpoint in. |
a933dad1 DL |
32 | All Lisp errors go through there. |
33 | ||
3102e429 EZ |
34 | It is useful, when debugging, to have a guaranteed way to return to |
35 | the debugger at any time. When using X, this is easy: type C-c at the | |
36 | window where Emacs is running under GDB, and it will stop Emacs just | |
37 | as it would stop any ordinary program. When Emacs is running in a | |
38 | terminal, things are not so easy. | |
39 | ||
40 | The src/.gdbinit file in the Emacs distribution arranges for SIGINT | |
41 | (C-g in Emacs) to be passed to Emacs and not give control back to GDB. | |
42 | On modern POSIX systems, you can override that with this command: | |
43 | ||
44 | handle int stop nopass | |
45 | ||
46 | After this `handle' command, SIGINT will return control to GDB. If | |
47 | you want the C-g to cause a QUIT within Emacs as well, omit the | |
48 | `nopass'. | |
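
The breakpoint and signal settings described so far can be kept in a GDB command file instead of being typed each time. The following is only a sketch: the helper name `write_gdb_commands' and the file path in the usage comment are hypothetical, and the GDB invocation is shown as a comment because it depends on where your Emacs binary lives.

```shell
# Write a GDB command file containing the settings discussed above.
# The output file name is whatever the caller passes as $1.
write_gdb_commands() {
    cat > "$1" <<'EOF'
# All Lisp errors go through Fsignal.
break Fsignal
# Let SIGINT return control to GDB instead of being passed to Emacs.
handle SIGINT stop nopass
EOF
}

# Typical use from the Emacs src directory (not run here):
#   write_gdb_commands /tmp/emacs-debug.gdb
#   gdb -x /tmp/emacs-debug.gdb --args ./emacs -q
```

Drop `nopass' from the generated file if you also want C-g to cause a QUIT within Emacs.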
49 | ||
50 | A technique that can work when `handle SIGINT' does not is to store | |
51 | the code for some character into the variable stop_character. Thus, | |
a933dad1 DL |
52 | |
53 | set stop_character = 29 | |
54 | ||
55 | makes Control-] (decimal code 29) the stop character. | |
56 | Typing Control-] will cause an immediate stop. You cannot | |
57 | use the set command until the inferior process has been started. | |
58 | Put a breakpoint early in `main', or suspend Emacs, | |
59 | to get an opportunity to do the set command. | |
60 | ||
a933dad1 DL |
61 | ** Examining Lisp object values. |
62 | ||
63 | When you have a live process to debug, and it has not encountered a | |
64 | fatal error, you can use the GDB command `pr'. First print the value | |
65 | in the ordinary way, with the `p' command. Then type `pr' with no | |
66 | arguments. This calls a subroutine which uses the Lisp printer. | |
67 | ||
437368fe EZ |
68 | Note: It is not a good idea to try `pr' if you know that Emacs is in |
69 | deep trouble: its stack smashed (e.g., if it encountered SIGSEGV due | |
70 | to stack overflow), or crucial data structures, such as `obarray', | |
71 | corrupted, etc. In such cases, the Emacs subroutine called by `pr' | |
72 | might cause more damage, such as overwriting data that is important for | |
73 | debugging the original problem. | |
74 | ||
3102e429 EZ |
75 | Also, on some systems it is impossible to use `pr' if you stopped |
76 | Emacs while it was inside `select'. This is in fact what happens if | |
77 | you stop Emacs while it is waiting. In such a situation, don't try to | |
78 | use `pr'. Instead, use `s' to step out of the system call. Then | |
79 | Emacs will be between instructions and capable of handling `pr'. | |
a933dad1 | 80 | |
3102e429 EZ |
81 | If you can't use the `pr' command, for whatever reason, you can fall back |
82 | on lower-level commands. Use the `xtype' command to print out the | |
83 | data type of the last data value. Once you know the data type, use | |
84 | the command that corresponds to that type. Here are these commands: | |
a933dad1 DL |
85 | |
86 | xint xptr xwindow xmarker xoverlay xmiscfree xintfwd xboolfwd xobjfwd | |
87 | xbufobjfwd xkbobjfwd xbuflocal xbuffer xsymbol xstring xvector xframe | |
88 | xwinconfig xcompiled xcons xcar xcdr xsubr xprocess xfloat xscrollbar | |
89 | ||
90 | Each one of them applies to a certain type or class of types. | |
91 | (Some of these types are not visible in Lisp, because they exist only | |
92 | internally.) | |
93 | ||
94 | Each x... command prints some information about the value, and | |
95 | produces a GDB value (subsequently available in $) through which you | |
96 | can get at the rest of the contents. | |
97 | ||
437368fe | 98 | In general, most of the rest of the contents will be additional Lisp |
a933dad1 DL |
99 | objects which you can examine in turn with the x... commands. |
100 | ||
437368fe EZ |
101 | Even with a live process, these x... commands are useful for |
102 | examining the fields in a buffer, window, process, frame or marker. | |
103 | Here's an example using concepts explained in the node "Value History" | |
104 | of the GDB manual to print the variable frame from this line in | |
105 | xmenu.c: | |
106 | ||
107 | buf.frame_or_window = frame; | |
108 | ||
109 | First, use these commands: | |
110 | ||
111 | cd src | |
112 | gdb emacs | |
113 | b xmenu.c:1296 | |
114 | r -q | |
115 | ||
116 | Then type C-x 5 2 to create a new frame, and it hits the breakpoint: | |
117 | ||
118 | (gdb) p frame | |
119 | $1 = 1077872640 | |
120 | (gdb) xtype | |
121 | Lisp_Vectorlike | |
122 | PVEC_FRAME | |
123 | (gdb) xframe | |
124 | $2 = (struct frame *) 0x3f0800 | |
125 | (gdb) p *$ | |
126 | $3 = { | |
127 | size = 536871989, | |
128 | next = 0x366240, | |
129 | name = 809661752, | |
130 | [...] | |
131 | } | |
132 | (gdb) p $3->name | |
133 | $4 = 809661752 | |
134 | ||
135 | Now we can use `pr' to print the name of the frame: | |
136 | ||
137 | (gdb) pr | |
138 | "emacs@steenrod.math.nwu.edu" | |
139 | ||
140 | The Emacs C code heavily uses macros defined in lisp.h. So suppose | |
141 | we want the address of the l-value expression near the bottom of | |
142 | `add_command_key' from keyboard.c: | |
143 | ||
144 | XVECTOR (this_command_keys)->contents[this_command_key_count++] = key; | |
145 | ||
146 | XVECTOR is a macro, and therefore GDB does not know about it. | |
147 | GDB cannot evaluate "p XVECTOR (this_command_keys)". | |
148 | ||
149 | However, you can use the xvector command in GDB to get the same | |
150 | result. Here is how: | |
151 | ||
152 | (gdb) p this_command_keys | |
153 | $1 = 1078005760 | |
154 | (gdb) xvector | |
155 | $2 = (struct Lisp_Vector *) 0x411000 | |
156 | 0 | |
157 | (gdb) p $->contents[this_command_key_count] | |
158 | $3 = 1077872640 | |
159 | (gdb) p &$ | |
160 | $4 = (int *) 0x411008 | |
161 | ||
162 | Here's a related example of macros and the GDB `define' command. | |
163 | There are many Lisp vectors such as `recent_keys', which contains the | |
164 | last 100 keystrokes. We can print this Lisp vector with: | |
165 | ||
166 | p recent_keys | |
167 | pr | |
168 | ||
169 | But this may be inconvenient, since `recent_keys' is much more verbose | |
170 | than `C-h l'. We might want to print only the last 10 elements of | |
171 | this vector. `recent_keys' is updated in keyboard.c by the command | |
172 | ||
173 | XVECTOR (recent_keys)->contents[recent_keys_index] = c; | |
174 | ||
175 | So we define a GDB command `xvector-elts', so the last 10 keystrokes | |
176 | are printed by | |
177 | ||
178 | xvector-elts recent_keys recent_keys_index 10 | |
179 | ||
180 | where you can define xvector-elts as follows: | |
181 | ||
182 | define xvector-elts | |
183 | set $i = 0 | |
184 | p $arg0 | |
185 | xvector | |
186 | set $foo = $ | |
187 | while $i < $arg2 | |
188 | p $foo->contents[$arg1-($i++)] | |
189 | pr | |
190 | end | |
 | end | |
191 | document xvector-elts | |
192 | Prints a range of elements of a Lisp vector. | |
193 | xvector-elts v n i | |
194 | prints `i' elements of the vector `v' ending at the index `n'. | |
195 | end | |
196 | ||
197 | ** Getting Lisp-level backtrace information within GDB | |
198 | ||
3102e429 EZ |
199 | The most convenient way is to use the `xbacktrace' command. This |
200 | shows the names of the Lisp functions that are currently active. | |
437368fe EZ |
201 | |
202 | If that doesn't work (e.g., because the `backtrace_list' structure is | |
203 | corrupted), type "bt" at the GDB prompt, to produce the C-level | |
204 | backtrace, and look for stack frames that call Ffuncall. Select them | |
205 | one by one in GDB, by typing "up N", where N is the appropriate number | |
206 | of frames to go up, and in each frame that calls Ffuncall type this: | |
207 | ||
208 | p *args | |
209 | pr | |
210 | ||
211 | This will print the name of the Lisp function called by that level | |
212 | of function calling. | |
213 | ||
214 | By printing the remaining elements of args, you can see the argument | |
215 | values. Here's how to print the first argument: | |
216 | ||
217 | p args[1] | |
218 | pr | |
219 | ||
220 | If you do not have a live process, you can use xtype and the other | |
221 | x... commands such as xsymbol to get such information, albeit less | |
222 | conveniently. For example: | |
223 | ||
224 | p *args | |
225 | xtype | |
226 | ||
227 | and, assuming that "xtype" says that args[0] is a symbol: | |
228 | ||
229 | xsymbol | |
230 | ||
231 | ** Debugging what happens while preloading and dumping Emacs | |
232 | ||
233 | Type `gdb temacs' and start it with `r -batch -l loadup dump'. | |
234 | ||
235 | If temacs actually succeeds when running under GDB in this way, do not | |
236 | try to run the dumped Emacs, because it was dumped with the GDB | |
237 | breakpoints in it. | |
238 | ||
239 | ** Debugging `temacs' | |
240 | ||
241 | Debugging `temacs' is useful when you want to establish whether a | |
242 | problem happens in an undumped Emacs. To run `temacs' under a | |
243 | debugger, type "gdb temacs", then start it with `r -batch -l loadup'. | |
244 | ||
245 | ** If you encounter X protocol errors | |
246 | ||
247 | Try evaluating (x-synchronize t). That puts Emacs into synchronous | |
248 | mode, where each Xlib call checks for errors before it returns. This | |
249 | mode is much slower, but when you get an error, you will see exactly | |
250 | which call really caused the error. | |
251 | ||
252 | ** If the symptom of the bug is that Emacs fails to respond | |
253 | ||
254 | Don't assume Emacs is `hung'--it may instead be in an infinite loop. | |
255 | To find out which, make the problem happen under GDB and stop Emacs | |
256 | once it is not responding. (If Emacs is using X Windows directly, you | |
257 | can stop Emacs by typing C-z at the GDB job.) Then try stepping with | |
258 | `step'. If Emacs is hung, the `step' command won't return. If it is | |
259 | looping, `step' will return. | |
260 | ||
261 | If this shows Emacs is hung in a system call, stop it again and | |
262 | examine the arguments of the call. If you report the bug, it is very | |
263 | important to state exactly where in the source the system call is, and | |
264 | what the arguments are. | |
265 | ||
266 | If Emacs is in an infinite loop, try to determine where the loop | |
267 | starts and ends. The easiest way to do this is to use the GDB command | |
268 | `finish'. Each time you use it, Emacs resumes execution until it | |
269 | exits one stack frame. Keep typing `finish' until it doesn't | |
270 | return--that means the infinite loop is in the stack frame which you | |
271 | just tried to finish. | |
272 | ||
273 | Stop Emacs again, and use `finish' repeatedly again until you get back | |
274 | to that frame. Then use `next' to step through that frame. By | |
275 | stepping, you will see where the loop starts and ends. Also, examine | |
276 | the data being used in the loop and try to determine why the loop does | |
277 | not exit when it should. | |
278 | ||
279 | ** If certain operations in Emacs are slower than they used to be, here | |
280 | is some advice for how to find out why. | |
281 | ||
282 | Stop Emacs repeatedly during the slow operation, and make a backtrace | |
283 | each time. Compare the backtraces looking for a pattern--a specific | |
284 | function that shows up more often than you'd expect. | |
285 | ||
286 | If you don't see a pattern in the C backtraces, get some Lisp | |
287 | backtrace information by typing "xbacktrace" or by looking at Ffuncall | |
288 | frames (see above), and again look for a pattern. | |
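
Looking for a pattern is easier if the backtraces are saved to files and the function names are tallied. The sketch below assumes each file holds the output of a GDB `bt' command, with frame lines of the usual "#N  [0xADDR in] FUNCTION (ARGS) ..." shape; the function name `tally_frames' and the file names in the usage note are hypothetical.

```shell
# Count how often each function appears across saved GDB backtraces.
# Usage: tally_frames bt1.txt bt2.txt ...
tally_frames() {
    # Keep only frame lines ("#N ..."), skip the optional "0xADDR in "
    # part, and print the function name that precedes " (".
    sed -n 's/^#[0-9][0-9]*  *\(0x[0-9a-fA-F]* in \)\{0,1\}\([^ (]*\) (.*/\2/p' "$@" |
        sort | uniq -c | sort -rn
}
```

Running `tally_frames bt1.txt bt2.txt bt3.txt' lists the most frequent functions first. Functions that appear in every backtrace, such as `main', are expected at the top; the interesting ones are those that show up more often than you would expect.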
289 | ||
290 | When using X, you can stop Emacs at any time by typing C-z at GDB. | |
291 | When not using X, you can do this with C-g. On non-Unix platforms, | |
292 | such as MS-DOS, you might need to press C-BREAK instead. | |
293 | ||
a933dad1 DL |
294 | ** If GDB does not run and your debuggers can't load Emacs. |
295 | ||
296 | On some systems, no debugger can load Emacs with a symbol table, | |
297 | perhaps because they all have fixed limits on the number of symbols | |
298 | and Emacs exceeds the limits. Here is a method that can be used | |
299 | in such an extremity. Do | |
300 | ||
301 | nm -n temacs > nmout | |
302 | strip temacs | |
303 | adb temacs | |
304 | 0xd:i | |
305 | 0xe:i | |
306 | 14:i | |
307 | 17:i | |
308 | :r -l loadup (or whatever) | |
309 | ||
310 | It is necessary to refer to the file `nmout' to convert | |
311 | numeric addresses into symbols and vice versa. | |
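
Looking up an address in `nmout' by hand is tedious, so here is a small sketch of the address-to-symbol direction. It assumes the usual `nm -n' layout of "ADDRESS TYPE NAME" lines sorted by address, with ADDRESS in hexadecimal; the function name `addr_to_symbol' is made up for illustration.

```shell
# Print the symbol in an `nm -n' listing whose address is the closest
# one at or below ADDR.  ADDR is hexadecimal without a 0x prefix.
# Usage: addr_to_symbol nmout ADDR
addr_to_symbol() {
    file=$1
    target=$((0x$2))
    best=
    while read -r addr type name; do
        # Skip undefined symbols and any line whose first field is not
        # a hexadecimal address.
        case $addr in
            ''|*[!0-9a-fA-F]*) continue ;;
        esac
        [ -n "$name" ] || continue
        if [ $((0x$addr)) -le "$target" ]; then
            best=$name
        else
            # The listing is sorted, so no later line can match.
            break
        fi
    done < "$file"
    printf '%s\n' "$best"
}
```

The result is the nearest preceding symbol, so an address inside a function maps to that function's name. For the opposite direction, searching `nmout' for the symbol name with `grep' is usually enough.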
312 | ||
313 | It is useful to be running under a window system. | |
314 | Then, if Emacs becomes hopelessly wedged, you can create | |
315 | another window to do kill -9 in. kill -ILL is often | |
316 | useful too, since that may make Emacs dump core or return | |
317 | to adb. | |
318 | ||
319 | ||
320 | ** Debugging incorrect screen updating. | |
321 | ||
322 | To debug problems where Emacs updates the screen incorrectly, it is | |
323 | useful to have a record of what input you typed and what Emacs sent to | |
324 | the screen. To make these records, do | |
325 | ||
326 | (open-dribble-file "~/.dribble") | |
327 | (open-termscript "~/.termscript") | |
328 | ||
329 | The dribble file contains all characters read by Emacs from the | |
330 | terminal, and the termscript file contains all characters it sent to | |
331 | the terminal. The use of the directory `~/' prevents interference | |
332 | with any other user. | |
333 | ||
334 | If you have irreproducible display problems, put those two expressions | |
335 | in your ~/.emacs file. When the problem happens, exit the Emacs that | |
336 | you were running, kill it, and rename the two files. Then you can start | |
337 | another Emacs without clobbering those files, and use it to examine them. | |
125f929e MB |
338 | |
339 | An easy way to see if too much text is being redrawn on a terminal is to | |
340 | evaluate `(setq inverse-video t)' before you try the operation you think | |
341 | will cause too much redrawing. This doesn't refresh the screen, so only | |
342 | newly drawn text is in inverse video. | |
437368fe EZ |
343 | |
344 | ||
345 | ** Debugging LessTif | |
346 | ||
347 | If you encounter bugs whereby Emacs built with LessTif grabs all mouse | |
348 | and keyboard events, or LessTif menus behave weirdly, it might be | |
349 | helpful to set the `DEBUGSOURCES' and `DEBUG_FILE' environment | |
350 | variables, so that one can see what LessTif was doing at this point. | |
351 | For instance | |
352 | ||
353 | export DEBUGSOURCES="RowColumn.c MenuShell.c MenuUtil.c" | |
354 | export DEBUG_FILE=/usr/tmp/LESSTIF_TRACE | |
2aa25884 | 355 | emacs & |
437368fe EZ |
356 | |
357 | causes LessTif to print traces from the three named source files to a | |
2aa25884 EZ |
358 | file in `/usr/tmp' (that file can get pretty large). The above should |
359 | be typed at the shell prompt before invoking Emacs, as shown by the | |
360 | last line above. | |
437368fe EZ |
361 | |
362 | Running GDB from another terminal could also help with such problems. | |
363 | You can arrange for GDB to run on one machine, with the Emacs display | |
364 | appearing on another. Then, when the bug happens, you can go back to | |
365 | the machine where you started GDB and use the debugger from there. | |
366 | ||
367 | ||
368 | ** Running Emacs with Purify | |
369 | ||
3102e429 | 370 | Some people who are willing to use non-free software use Purify. We |
8e92a96a EZ |
371 | can't ethically ask you to become a Purify user; but if you have it, |
372 | and you test Emacs with it, we will not refuse to look at the results | |
373 | you find. | |
3102e429 | 374 | |
437368fe EZ |
375 | Emacs compiled with Purify won't run without some hacking. Here are |
376 | some of the changes you might find necessary (SYSTEM-NAME and | |
377 | MACHINE-NAME are the names of your OS- and CPU-specific headers in the | |
378 | subdirectories of `src'): | |
379 | ||
380 | - In src/s/SYSTEM-NAME.h add "#define SYSTEM_MALLOC". | |
381 | ||
382 | - In src/m/MACHINE-NAME.h add "#define CANNOT_DUMP" and | |
383 | "#define CANNOT_UNEXEC". | |
384 | ||
385 | - Configure with a different --prefix= option. If you use GCC, | |
386 | version 2.7.2 is preferred, as Purify works a lot better with it | |
387 | than with 2.95 or later versions. | |
388 | ||
389 | - Type "make" then "make -k install". You might need to run | |
8e92a96a | 390 | "make -k install" twice. |
437368fe EZ |
391 | |
392 | - cd src; purify -chain-length=40 gcc <link command line for temacs> | |
393 | ||
394 | - cd ..; src/temacs | |
395 | ||
396 | Note that Purify might print lots of false alarms for bitfields used | |
397 | by Emacs in some data structures. If you want to get rid of the false | |
398 | alarms, you will have to hack the definitions of these data structures | |
8e92a96a | 399 | in the respective headers to remove the `:N' bitfield definitions |
437368fe EZ |
400 | (which will cause each such field to use a full int). |
401 | ||
437368fe EZ |
402 | ** Debugging problems which happen in GC |
403 | ||
404 | The array `last_marked' (defined in alloc.c) can be used to display | |
405 | up to the last 500 objects marked by the garbage collection process. The | |
406 | variable `last_marked_index' holds the index into the `last_marked' | |
407 | array one place beyond where the very last marked object is stored. | |
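
Because `last_marked_index' points one slot past the newest entry, the indices of the most recent objects have to be computed backwards. The sketch below assumes that `last_marked' is filled circularly, so that the index wraps to 0 after slot 499; the text implies this but does not state it. The helper name `last_marked_indices' is hypothetical.

```shell
# Print the array indices of the COUNT most recently marked objects,
# newest first.  Assumes a circular buffer of SIZE slots (500 by default)
# and COUNT no larger than SIZE.
# Usage: last_marked_indices LAST_MARKED_INDEX COUNT [SIZE]
last_marked_indices() {
    idx=$1
    count=$2
    size=${3:-500}
    k=1
    while [ "$k" -le "$count" ]; do
        # Step back k slots from idx and wrap around the start of the array.
        echo $(( (idx - k + size) % size ))
        k=$((k + 1))
    done
}

# With last_marked_index equal to 2, this prints 1, 0, 499 and 498,
# one per line.
last_marked_indices 2 4
```

Each index can then be examined in GDB with a command such as `p last_marked[499]'.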
408 | ||
409 | The single most important goal in debugging GC problems is to find the | |
410 | Lisp data structure that got corrupted. This is not easy since GC | |
411 | changes the tag bits and relocates strings, which makes it hard to look | |
412 | at Lisp objects with commands such as `pr'. It is sometimes necessary | |
413 | to convert Lisp_Object variables into pointers to C structs manually. | |
414 | Use the `last_marked' array and the source to reconstruct the sequence | |
415 | in which objects were marked. | |
416 | ||
417 | Once you discover the corrupted Lisp object or data structure, it is | |
8e92a96a EZ |
418 | useful to look at it in a fresh Emacs session and compare its contents |
419 | with a session that you are debugging. | |
437368fe | 420 | |
437368fe EZ |
421 | ** Some suggestions for debugging on MS Windows: |
422 | ||
423 | (written by Marc Fleischeuers, Geoff Voelker and Andrew Innes) | |
424 | ||
3102e429 | 425 | To debug Emacs with Microsoft Visual C++, you either start emacs from |
437368fe EZ |
426 | the debugger or attach the debugger to a running emacs process. To |
427 | start emacs from the debugger, you can use the file bin/debug.bat. The | |
428 | Microsoft Developer studio will start and under Project, Settings, | |
3102e429 | 429 | Debug, General you can set the command-line arguments and Emacs's |
437368fe EZ |
430 | startup directory. Set breakpoints (Edit, Breakpoints) at Fsignal and |
431 | other functions that you want to examine. Run the program (Build, | |
432 | Start debug). Emacs will start and the debugger will take control as | |
433 | soon as a breakpoint is hit. | |
434 | ||
3102e429 | 435 | You can also attach the debugger to an already running Emacs process. |
437368fe EZ |
436 | To do this, start up the Microsoft Developer studio and select Build, |
437 | Start debug, Attach to process. Choose the Emacs process from the | |
438 | list. Send a break to the running process (Debug, Break) and you will | |
439 | find that execution is halted somewhere in user32.dll. Open the stack | |
440 | trace window and go up the stack to w32_msg_pump. Now you can set | |
441 | breakpoints in Emacs (Edit, Breakpoints). Continue the running Emacs | |
442 | process (Debug, Step out) and control will return to Emacs, until a | |
443 | breakpoint is hit. | |
444 | ||
3102e429 | 445 | To examine the contents of a Lisp variable, you can use the function |
437368fe EZ |
446 | 'debug_print'. Right-click on a variable, select QuickWatch (it has |
447 | an eyeglass symbol on its button in the toolbar), and in the text | |
448 | field at the top of the window, place 'debug_print(' and ')' around | |
449 | the expression. Press 'Recalculate' and the output is sent to stderr, | |
450 | and to the debugger via the OutputDebugString routine. The output | |
451 | sent to stderr should be displayed in the console window that was | |
452 | opened when the emacs.exe executable was started. The output sent to | |
453 | the debugger should be displayed in the 'Debug' pane in the Output | |
454 | window. If Emacs was started from the debugger, a console window was | |
455 | opened at Emacs' startup; this console window also shows the output of | |
456 | 'debug_print'. | |
457 | ||
458 | For example, start and run Emacs in the debugger until it is waiting | |
459 | for user input. Then click on the `Break' button in the debugger to | |
460 | halt execution. Emacs should halt in `ZwUserGetMessage' waiting for | |
461 | an input event. Use the `Call Stack' window to select the procedure | |
462 | `w32_msg_pump' up the call stack (see below for why you have to do | |
463 | this). Open the QuickWatch window and enter | |
464 | "debug_print(Vexec_path)". Evaluating this expression will then print | |
3102e429 | 465 | out the contents of the Lisp variable `exec-path'. |
437368fe EZ |
466 | |
467 | If QuickWatch reports that the symbol is unknown, then check the call | |
468 | stack in the `Call Stack' window. If the selected frame in the call | |
469 | stack is not an Emacs procedure, then the debugger won't recognize | |
470 | Emacs symbols. Instead, select a frame that is inside an Emacs | |
471 | procedure and try using `debug_print' again. | |
472 | ||
473 | If QuickWatch invokes debug_print but nothing happens, then check the | |
474 | thread that is selected in the debugger. If the selected thread is | |
475 | not the last thread to run (the "current" thread), then it cannot be | |
476 | used to execute debug_print. Use the Debug menu to select the current | |
477 | thread and try using debug_print again. Note that the debugger halts | |
478 | execution (e.g., due to a breakpoint) in the context of the current | |
479 | thread, so this should only be a problem if you've explicitly switched | |
480 | threads. | |
481 | ||
3102e429 | 482 | It is also possible to keep appropriately masked and typecast Lisp |
437368fe EZ |
483 | symbols in the Watch window; this is more convenient when stepping |
484 | through the code. For instance, on entering apply_lambda, you can | |
485 | watch (struct Lisp_Symbol *) (0xfffffff & args[0]). |
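
The effect of the 0xfffffff mask can be checked with plain arithmetic. The sketch below applies it to two Lisp_Object values that appear in the GDB sessions earlier in this document; the helper name `untag' is made up, and the mask is the one from the text, so it only applies to builds with that object layout.

```shell
# Strip the tag bits from a Lisp_Object value by keeping the low 28 bits.
# The mask matches the one used in the text and is layout-specific.
untag() {
    printf '0x%x\n' $(( $1 & 0xfffffff ))
}

untag 1077872640    # value of `frame'; prints 0x3f0800
untag 1078005760    # value of `this_command_keys'; prints 0x411000
```

These results match the pointers that `xframe' and `xvector' reported for the same values in the earlier examples.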