| 1 | Debugging GNU Emacs |
| 2 | Copyright (c) 1985, 2000, 2001 Free Software Foundation, Inc. |
| 3 | |
| 4 | Permission is granted to anyone to make or distribute verbatim copies |
| 5 | of this document as received, in any medium, provided that the |
| 6 | copyright notice and permission notice are preserved, |
| 7 | and that the distributor grants the recipient permission |
| 8 | for further redistribution as permitted by this notice. |
| 9 | |
| 10 | Permission is granted to distribute modified versions |
| 11 | of this document, or of portions of it, |
| 12 | under the above conditions, provided also that they |
| 13 | carry prominent notices stating who last changed them. |
| 14 | |
| 15 | [People who debug Emacs on Windows using native Windows debuggers |
| 16 | should read the Windows-specific section near the end of this |
| 17 | document.] |
| 18 | |
| 19 | It is a good idea to run Emacs under GDB (or some other suitable |
| 20 | debugger) *all the time*. Then, when Emacs crashes, you will be able |
| 21 | to debug the live process, not just a core dump. (This is especially |
| 22 | important on systems which don't support core files, and instead print |
| 23 | just the registers and some stack addresses.) |
| 24 | |
| 25 | If Emacs hangs, or seems to be stuck in some infinite loop, typing |
| 26 | "kill -TSTP PID", where PID is the Emacs process ID, will cause GDB to |
| 27 | kick in, provided that you run under GDB. |
| 28 | |
| 29 | ** Getting control to the debugger |
| 30 | |
| 31 | `Fsignal' is a very useful place to put a breakpoint. |
| 32 | All Lisp errors go through there. |
| 33 | |
| 34 | It is useful, when debugging, to have a guaranteed way to return to |
| 35 | the debugger at any time. When using X, this is easy: type C-z at the |
| 36 | window where Emacs is running under GDB, and it will stop Emacs just |
| 37 | as it would stop any ordinary program. When Emacs is running in a |
| 38 | terminal, things are not so easy. |
| 39 | |
| 40 | The src/.gdbinit file in the Emacs distribution arranges for SIGINT |
| 41 | (C-g in Emacs) to be passed to Emacs and not give control back to GDB. |
| 42 | On modern POSIX systems, you can override that with this command: |
| 43 | |
| 44 | handle SIGINT stop nopass |
| 45 | |
| 46 | After this `handle' command, SIGINT will return control to GDB. If |
| 47 | you want the C-g to cause a QUIT within Emacs as well, omit the |
| 48 | `nopass'. |
| 49 | |
| 50 | A technique that can work when `handle SIGINT' does not is to store |
| 51 | the code for some character into the variable stop_character. Thus, |
| 52 | |
| 53 | set stop_character = 29 |
| 54 | |
| 55 | makes Control-] (decimal code 29) the stop character. |
| 56 | Typing Control-] will cause an immediate stop. You cannot |
| 57 | use the set command until the inferior process has been started. |
| 58 | Put a breakpoint early in `main', or suspend Emacs, |
| 59 | to get an opportunity to do the set command. |
| 60 | |
| 61 | ** Examining Lisp object values. |
| 62 | |
| 63 | When you have a live process to debug, and it has not encountered a |
| 64 | fatal error, you can use the GDB command `pr'. First print the value |
| 65 | in the ordinary way, with the `p' command. Then type `pr' with no |
| 66 | arguments. This calls a subroutine which uses the Lisp printer. |
| 67 | |
| 68 | Note: It is not a good idea to try `pr' if you know that Emacs is in |
| 69 | deep trouble: its stack smashed (e.g., if it encountered SIGSEGV due |
| 70 | to stack overflow), or crucial data structures, such as `obarray', |
| 71 | corrupted, etc. In such cases, the Emacs subroutine called by `pr' |
| 72 | might do more damage, such as overwriting data that is important for |
| 73 | debugging the original problem. |
| 74 | |
| 75 | Also, on some systems it is impossible to use `pr' if you stopped |
| 76 | Emacs while it was inside `select'. This is in fact what happens if |
| 77 | you stop Emacs while it is waiting. In such a situation, don't try to |
| 78 | use `pr'. Instead, use `s' to step out of the system call. Then |
| 79 | Emacs will be between instructions and capable of handling `pr'. |
| 80 | |
| 81 | If you can't use the `pr' command, for whatever reason, you can fall back |
| 82 | on lower-level commands. Use the `xtype' command to print out the |
| 83 | data type of the last data value. Once you know the data type, use |
| 84 | the command that corresponds to that type. Here are these commands: |
| 85 | |
| 86 | xint xptr xwindow xmarker xoverlay xmiscfree xintfwd xboolfwd xobjfwd |
| 87 | xbufobjfwd xkbobjfwd xbuflocal xbuffer xsymbol xstring xvector xframe |
| 88 | xwinconfig xcompiled xcons xcar xcdr xsubr xprocess xfloat xscrollbar |
| 89 | |
| 90 | Each one of them applies to a certain type or class of types. |
| 91 | (Some of these types are not visible in Lisp, because they exist only |
| 92 | internally.) |
| 93 | |
| 94 | Each x... command prints some information about the value, and |
| 95 | produces a GDB value (subsequently available in $) through which you |
| 96 | can get at the rest of the contents. |
| 97 | |
| 98 | In general, most of the rest of the contents will be additional Lisp |
| 99 | objects which you can examine in turn with the x... commands. |
| 100 | |
| 101 | Even with a live process, these x... commands are useful for |
| 102 | examining the fields in a buffer, window, process, frame or marker. |
| 103 | Here's an example using concepts explained in the node "Value History" |
| 104 | of the GDB manual to print the variable frame from this line in |
| 105 | xmenu.c: |
| 106 | |
| 107 | buf.frame_or_window = frame; |
| 108 | |
| 109 | First, use these commands: |
| 110 | |
| 111 | cd src |
| 112 | gdb emacs |
| 113 | b xmenu.c:1296 |
| 114 | r -q |
| 115 | |
| 116 | Then type C-x 5 2 to create a new frame, and it hits the breakpoint: |
| 117 | |
| 118 | (gdb) p frame |
| 119 | $1 = 1077872640 |
| 120 | (gdb) xtype |
| 121 | Lisp_Vectorlike |
| 122 | PVEC_FRAME |
| 123 | (gdb) xframe |
| 124 | $2 = (struct frame *) 0x3f0800 |
| 125 | (gdb) p *$ |
| 126 | $3 = { |
| 127 | size = 536871989, |
| 128 | next = 0x366240, |
| 129 | name = 809661752, |
| 130 | [...] |
| 131 | } |
| 132 | (gdb) p $3->name |
| 133 | $4 = 809661752 |
| 134 | |
| 135 | Now we can use `pr' to print the name of the frame: |
| 136 | |
| 137 | (gdb) pr |
| 138 | "emacs@steenrod.math.nwu.edu" |
| 139 | |
| 140 | The Emacs C code heavily uses macros defined in lisp.h. So suppose |
| 141 | we want the address of the l-value expression near the bottom of |
| 142 | `add_command_key' from keyboard.c: |
| 143 | |
| 144 | XVECTOR (this_command_keys)->contents[this_command_key_count++] = key; |
| 145 | |
| 146 | XVECTOR is a macro, and therefore GDB does not know about it. |
| 147 | GDB cannot evaluate "p XVECTOR (this_command_keys)". |
| 148 | |
| 149 | However, you can use the xvector command in GDB to get the same |
| 150 | result. Here is how: |
| 151 | |
| 152 | (gdb) p this_command_keys |
| 153 | $1 = 1078005760 |
| 154 | (gdb) xvector |
| 155 | $2 = (struct Lisp_Vector *) 0x411000 |
| 156 | 0 |
| 157 | (gdb) p $->contents[this_command_key_count] |
| 158 | $3 = 1077872640 |
| 159 | (gdb) p &$ |
| 160 | $4 = (int *) 0x411008 |
| 161 | |
| 162 | Here's a related example of macros and the GDB `define' command. |
| 163 | There are many Lisp vectors such as `recent_keys', which contains the |
| 164 | last 100 keystrokes. We can print this Lisp vector with: |
| 165 | |
| 166 | p recent_keys |
| 167 | pr |
| 168 | |
| 169 | But this may be inconvenient, since `recent_keys' is much more verbose |
| 170 | than `C-h l'. We might want to print only the last 10 elements of |
| 171 | this vector. `recent_keys' is updated in keyboard.c by the statement |
| 172 | |
| 173 | XVECTOR (recent_keys)->contents[recent_keys_index] = c; |
| 174 | |
| 175 | So we define a GDB command `xvector-elts', so the last 10 keystrokes |
| 176 | are printed by |
| 177 | |
| 178 | xvector-elts recent_keys recent_keys_index 10 |
| 179 | |
| 180 | where you can define xvector-elts as follows: |
| 181 | |
| 182 | define xvector-elts |
| 183 | set $i = 0 |
| 184 | p $arg0 |
| 185 | xvector |
| 186 | set $foo = $ |
| 187 | while $i < $arg2 |
| 188 | p $foo->contents[$arg1-($i++)] |
| 189 | pr |
| 190 | end |
| 191 | end |
| 192 | document xvector-elts |
| 193 | Prints a range of elements of a Lisp vector. |
| 194 | xvector-elts v n i prints `i' elements of the vector `v' ending at the index `n'. |
| 195 | end |
| 196 | |
| 197 | ** Getting Lisp-level backtrace information within GDB |
| 198 | |
| 199 | The most convenient way is to use the `xbacktrace' command. This |
| 200 | shows the names of the Lisp functions that are currently active. |
| 201 | |
| 202 | If that doesn't work (e.g., because the `backtrace_list' structure is |
| 203 | corrupted), type "bt" at the GDB prompt, to produce the C-level |
| 204 | backtrace, and look for stack frames that call Ffuncall. Select them |
| 205 | one by one in GDB, by typing "up N", where N is the appropriate number |
| 206 | of frames to go up, and in each frame that calls Ffuncall type this: |
| 207 | |
| 208 | p *args |
| 209 | pr |
| 210 | |
| 211 | This will print the name of the Lisp function called by that level |
| 212 | of function calling. |
| 213 | |
| 214 | By printing the remaining elements of args, you can see the argument |
| 215 | values. Here's how to print the first argument: |
| 216 | |
| 217 | p args[1] |
| 218 | pr |
| 219 | |
| 220 | If you do not have a live process, you can use xtype and the other |
| 221 | x... commands such as xsymbol to get such information, albeit less |
| 222 | conveniently. For example: |
| 223 | |
| 224 | p *args |
| 225 | xtype |
| 226 | |
| 227 | and, assuming that "xtype" says that args[0] is a symbol: |
| 228 | |
| 229 | xsymbol |
| 230 | |
| 231 | ** Debugging what happens while preloading and dumping Emacs |
| 232 | |
| 233 | Type `gdb temacs' and start it with `r -batch -l loadup dump'. |
| 234 | |
| 235 | If temacs actually succeeds when running under GDB in this way, do not |
| 236 | try to run the dumped Emacs, because it was dumped with the GDB |
| 237 | breakpoints in it. |
| 238 | |
| 239 | ** Debugging `temacs' |
| 240 | |
| 241 | Debugging `temacs' is useful when you want to establish whether a |
| 242 | problem happens in an undumped Emacs. To run `temacs' under a |
| 243 | debugger, type "gdb temacs", then start it with `r -batch -l loadup'. |
| 244 | |
| 245 | ** If you encounter X protocol errors |
| 246 | |
| 247 | Try evaluating (x-synchronize t). That puts Emacs into synchronous |
| 248 | mode, where each Xlib call checks for errors before it returns. This |
| 249 | mode is much slower, but when you get an error, you will see exactly |
| 250 | which call really caused the error. |
| 251 | |
| 252 | You can start Emacs in a synchronous mode by invoking it with the -xrm |
| 253 | option, like this: |
| 254 | |
| 255 | emacs -xrm "emacs.synchronous: true" |
| 256 | |
| 257 | Setting a breakpoint in the function `x_error_quitter' and looking at |
| 258 | the backtrace when Emacs stops inside that function will show what |
| 259 | code causes the X protocol errors. |
| 260 | |
| 261 | Some bugs related to the X protocol disappear when Emacs runs in a |
| 262 | synchronous mode. To track down those bugs, we suggest the following |
| 263 | procedure: |
| 264 | |
| 265 | - Run Emacs under a debugger and put a breakpoint inside the |
| 266 | primitive function which, when called from Lisp, triggers the X |
| 267 | protocol errors. For example, if the errors happen when you |
| 268 | delete a frame, put a breakpoint inside `Fdelete_frame'. |
| 269 | |
| 270 | - When the breakpoint is hit, step through the code, looking for |
| 271 | calls to X functions (the ones whose names begin with "X" or |
| 272 | "Xt" or "Xm"). |
| 273 | |
| 274 | - Insert calls to `XSync' before and after each call to the X |
| 275 | functions, like this: |
| 276 | |
| 277 | XSync (f->output_data.x->display_info->display, 0); |
| 278 | |
| 279 | where `f' is the pointer to the `struct frame' of the selected |
| 280 | frame, normally available via XFRAME (selected_frame). (Most |
| 281 | functions which call X already have some variable that holds the |
| 282 | pointer to the frame, perhaps called `f' or `sf', so you shouldn't |
| 283 | need to compute it.) |
| 284 | |
| 285 | If your debugger can call functions in the program being debugged, |
| 286 | you should be able to issue the calls to `XSync' without recompiling |
| 287 | Emacs. For example, with GDB, just type: |
| 288 | |
| 289 | call XSync (f->output_data.x->display_info->display, 0) |
| 290 | |
| 291 | before and immediately after the suspect X calls. If your |
| 292 | debugger does not support this, you will need to add these pairs |
| 293 | of calls in the source and rebuild Emacs. |
| 294 | |
| 295 | Either way, systematically step through the code and issue these |
| 296 | calls until you find the first X function called by Emacs after |
| 297 | which a call to `XSync' winds up in the function |
| 298 | `x_error_quitter'. The first X function call for which this |
| 299 | happens is the one that generated the X protocol error. |
| 300 | |
| 301 | - You should now look around this offending X call and try to figure |
| 302 | out what is wrong with it. |
| 303 | |
| 304 | ** If the symptom of the bug is that Emacs fails to respond |
| 305 | |
| 306 | Don't assume Emacs is `hung'--it may instead be in an infinite loop. |
| 307 | To find out which, make the problem happen under GDB and stop Emacs |
| 308 | once it is not responding. (If Emacs is using X Windows directly, you |
| 309 | can stop Emacs by typing C-z at the GDB job.) Then try stepping with |
| 310 | `step'. If Emacs is hung, the `step' command won't return. If it is |
| 311 | looping, `step' will return. |
| 312 | |
| 313 | If this shows Emacs is hung in a system call, stop it again and |
| 314 | examine the arguments of the call. If you report the bug, it is very |
| 315 | important to state exactly where in the source the system call is, and |
| 316 | what the arguments are. |
| 317 | |
| 318 | If Emacs is in an infinite loop, try to determine where the loop |
| 319 | starts and ends. The easiest way to do this is to use the GDB command |
| 320 | `finish'. Each time you use it, Emacs resumes execution until it |
| 321 | exits one stack frame. Keep typing `finish' until it doesn't |
| 322 | return--that means the infinite loop is in the stack frame which you |
| 323 | just tried to finish. |
| 324 | |
| 325 | Stop Emacs again, and use `finish' repeatedly again until you get back |
| 326 | to that frame. Then use `next' to step through that frame. By |
| 327 | stepping, you will see where the loop starts and ends. Also, examine |
| 328 | the data being used in the loop and try to determine why the loop does |
| 329 | not exit when it should. |
| 330 | |
| 331 | ** If certain operations in Emacs are slower than they used to be, here |
| 332 | is some advice for how to find out why. |
| 333 | |
| 334 | Stop Emacs repeatedly during the slow operation, and make a backtrace |
| 335 | each time. Compare the backtraces looking for a pattern--a specific |
| 336 | function that shows up more often than you'd expect. |
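
The comparison can be partly automated. The following is a minimal
sketch, assuming you have saved the text of each `bt' run to its own
file; the function name `count_backtrace_functions' and the file names
are hypothetical, and the parsing assumes the usual GDB frame format
("#N ADDRESS in FUNCTION (...)" or "#N FUNCTION (...)").

```shell
# Count how often each function appears across saved GDB backtraces.
# Each argument is a file holding the output of one `bt' command, e.g.
#   #0  0x0809a3c1 in Fsignal (...) at eval.c:1520
#   #1  main (argc=1, ...) at emacs.c:900
count_backtrace_functions () {
    awk '
        /^#[0-9]+/ {
            # With an address, the name follows "in"; otherwise it is field 2.
            name = ($3 == "in") ? $4 : $2
            count[name]++
        }
        END {
            for (name in count)
                printf "%d %s\n", count[name], name
        }
    ' "$@" | sort -k1,1nr -k2,2
}
```

Running "count_backtrace_functions bt.1 bt.2 bt.3" prints one line per
function, most frequent first. Functions near the bottom of the stack,
such as `main', appear in every sample and can be ignored.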
| 337 | |
| 338 | If you don't see a pattern in the C backtraces, get some Lisp |
| 339 | backtrace information by typing "xbacktrace" or by looking at Ffuncall |
| 340 | frames (see above), and again look for a pattern. |
| 341 | |
| 342 | When using X, you can stop Emacs at any time by typing C-z at GDB. |
| 343 | When not using X, you can do this with C-g. On non-Unix platforms, |
| 344 | such as MS-DOS, you might need to press C-BREAK instead. |
| 345 | |
| 346 | ** If GDB does not run and your debuggers can't load Emacs. |
| 347 | |
| 348 | On some systems, no debugger can load Emacs with a symbol table, |
| 349 | perhaps because they all have fixed limits on the number of symbols |
| 350 | and Emacs exceeds the limits. Here is a method that can be used |
| 351 | in such an extremity. Do |
| 352 | |
| 353 | nm -n temacs > nmout |
| 354 | strip temacs |
| 355 | adb temacs |
| 356 | 0xd:i |
| 357 | 0xe:i |
| 358 | 14:i |
| 359 | 17:i |
| 360 | :r -l loadup (or whatever) |
| 361 | |
| 362 | It is necessary to refer to the file `nmout' to convert |
| 363 | numeric addresses into symbols and vice versa. |
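
Looking up an address by hand in `nmout' gets tedious. Here is a
minimal sketch of a helper, assuming the usual three-column `nm -n'
output (address, type, name) sorted by address; the function name
`addr_to_symbol' is hypothetical, and addresses too large for the
shell's signed arithmetic are not handled.

```shell
# Print the symbol from an `nm -n' listing that contains a given address.
# usage: addr_to_symbol NMOUT-FILE ADDRESS   (ADDRESS such as 0x8048150)
addr_to_symbol () {
    target=$(( $2 ))
    best=
    while read -r addr type name; do
        case $addr in
            ''|*[!0-9a-fA-F]*) continue ;;   # undefined symbols have no address
        esac
        [ -n "$name" ] || continue
        if [ $(( 0x$addr )) -le "$target" ]; then
            best=$name                       # latest symbol at or below target
        else
            break                            # listing is sorted by address
        fi
    done < "$1"
    [ -n "$best" ] && printf '%s\n' "$best"
}
```

For the opposite direction, from symbol to address, searching `nmout'
for the name with grep is enough.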
| 364 | |
| 365 | It is useful to be running under a window system. |
| 366 | Then, if Emacs becomes hopelessly wedged, you can create |
| 367 | another window to do kill -9 in. kill -ILL is often |
| 368 | useful too, since that may make Emacs dump core or return |
| 369 | to adb. |
| 370 | |
| 371 | |
| 372 | ** Debugging incorrect screen updating. |
| 373 | |
| 374 | To debug Emacs problems that update the screen wrong, it is useful |
| 375 | to have a record of what input you typed and what Emacs sent to the |
| 376 | screen. To make these records, do |
| 377 | |
| 378 | (open-dribble-file "~/.dribble") |
| 379 | (open-termscript "~/.termscript") |
| 380 | |
| 381 | The dribble file contains all characters read by Emacs from the |
| 382 | terminal, and the termscript file contains all characters it sent to |
| 383 | the terminal. The use of the directory `~/' prevents interference |
| 384 | with any other user. |
| 385 | |
| 386 | If you have irreproducible display problems, put those two expressions |
| 387 | in your ~/.emacs file. When the problem happens, exit the Emacs that |
| 388 | you were running, kill it, and rename the two files. Then you can start |
| 389 | another Emacs without clobbering those files, and use it to examine them. |
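
The renaming step can be scripted so that it is not forgotten in the
heat of the moment. This is only a sketch: the function name
`save_display_logs' and the timestamp suffix are assumptions, and the
file names are the ones used in the expressions above.

```shell
# Move the dribble and termscript files aside under a timestamped name.
# usage: save_display_logs [DIRECTORY]   (DIRECTORY defaults to $HOME)
save_display_logs () {
    dir=${1:-$HOME}
    stamp=$(date +%Y%m%d-%H%M%S)
    for f in .dribble .termscript; do
        # Skip a file that was never created.
        if [ -e "$dir/$f" ]; then
            mv "$dir/$f" "$dir/$f.$stamp" || return 1
        fi
    done
}
```

Run it after the old Emacs has exited and before you start a new one.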
| 390 | |
| 391 | An easy way to see if too much text is being redrawn on a terminal is to |
| 392 | evaluate `(setq inverse-video t)' before you try the operation you think |
| 393 | will cause too much redrawing. This doesn't refresh the screen, so only |
| 394 | newly drawn text is in inverse video. |
| 395 | |
| 396 | The Emacs display code includes special debugging code, but it is |
| 397 | normally disabled. You can enable it by building Emacs with the |
| 398 | pre-processing symbol GLYPH_DEBUG defined. Here's one easy way, |
| 399 | suitable for Unix and GNU systems, to build such a debugging version: |
| 400 | |
| 401 | MYCPPFLAGS='-DGLYPH_DEBUG=1' make |
| 402 | |
| 403 | Building Emacs like that activates many assertions which scrutinize |
| 404 | display code operation more than Emacs does normally. (To see the |
| 405 | code which tests these assertions, look for calls to the `xassert' |
| 406 | macro.) Any assertion that is reported to fail should be |
| 407 | investigated. |
| 408 | |
| 409 | Building with GLYPH_DEBUG defined also defines several helper |
| 410 | functions which can help debugging display code. One such function is |
| 411 | `dump_glyph_matrix'. If you run Emacs under GDB, you can print the |
| 412 | contents of any glyph matrix by just calling that function with the |
| 413 | matrix as its argument. For example, the following command will print |
| 414 | the contents of the current matrix of the window whose pointer is in |
| 415 | `w': |
| 416 | |
| 417 | (gdb) p dump_glyph_matrix (w->current_matrix, 2) |
| 418 | |
| 419 | (The second argument 2 tells dump_glyph_matrix to print the glyphs in |
| 420 | a long form.) You can dump the selected window's current glyph matrix |
| 421 | interactively with "M-x dump-glyph-matrix RET"; see the documentation |
| 422 | of this function for more details. |
| 423 | |
| 424 | Several more functions for debugging display code are available in |
| 425 | Emacs compiled with GLYPH_DEBUG defined; type "C-h f dump- TAB" and |
| 426 | "C-h f trace- TAB" to see the full list. |
| 427 | |
| 428 | |
| 429 | ** Debugging LessTif |
| 430 | |
| 431 | If you encounter bugs whereby Emacs built with LessTif grabs all mouse |
| 432 | and keyboard events, or LessTif menus behave weirdly, it might be |
| 433 | helpful to set the `DEBUGSOURCES' and `DEBUG_FILE' environment |
| 434 | variables, so that one can see what LessTif was doing at this point. |
| 435 | For instance |
| 436 | |
| 437 | export DEBUGSOURCES="RowColumn.c:MenuShell.c:MenuUtil.c" |
| 438 | export DEBUG_FILE=/usr/tmp/LESSTIF_TRACE |
| 439 | emacs & |
| 440 | |
| 441 | causes LessTif to print traces from the three named source files to a |
| 442 | file in `/usr/tmp' (that file can get pretty large). The above should |
| 443 | be typed at the shell prompt before invoking Emacs, as shown by the |
| 444 | last line above. |
| 445 | |
| 446 | Running GDB from another terminal could also help with such problems. |
| 447 | You can arrange for GDB to run on one machine, with the Emacs display |
| 448 | appearing on another. Then, when the bug happens, you can go back to |
| 449 | the machine where you started GDB and use the debugger from there. |
| 450 | |
| 451 | |
| 452 | ** Debugging problems which happen in GC |
| 453 | |
| 454 | The array `last_marked' (defined in alloc.c) can be used to display up |
| 455 | to the last 500 objects marked by the garbage collection process. |
| 456 | Whenever the garbage collector marks a Lisp object, it records the |
| 457 | pointer to that object in the `last_marked' array. The variable |
| 458 | `last_marked_index' holds the index into the `last_marked' array one |
| 459 | place beyond where the pointer to the very last marked object is |
| 460 | stored. |
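
Because `last_marked_index' points one place beyond the newest entry,
walking backwards through the array takes a little index arithmetic.
The sketch below generates the GDB commands for that walk. It assumes
that `last_marked' is used as a circular buffer of 500 entries, so that
the index wraps from 0 back to 499; check alloc.c in your sources
before relying on that. The function name `last_marked_commands' is
hypothetical.

```shell
# Emit GDB `p' commands for the N most recently marked objects, newest first.
# usage: last_marked_commands LAST_MARKED_INDEX [N]   (N defaults to 10)
last_marked_commands () {
    index=$1
    n=${2:-10}
    size=500        # array size stated in the text; wrap-around is assumed
    k=1
    while [ "$k" -le "$n" ]; do
        # Step k places back from the slot one beyond the newest entry.
        printf 'p last_marked[%d]\n' $(( (index - k + size) % size ))
        k=$(( k + 1 ))
    done
}
```

Print `last_marked_index' in GDB first, then pass its value to the
function and paste the resulting commands at the GDB prompt.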
| 461 | |
| 462 | The single most important goal in debugging GC problems is to find the |
| 463 | Lisp data structure that got corrupted. This is not easy since GC |
| 464 | changes the tag bits and relocates strings, which makes it hard to look |
| 465 | at Lisp objects with commands such as `pr'. It is sometimes necessary |
| 466 | to convert Lisp_Object variables into pointers to C structs manually. |
| 467 | Use the `last_marked' array and the source to reconstruct the sequence |
| 468 | in which objects were marked. |
| 469 | |
| 470 | Once you discover the corrupted Lisp object or data structure, it is |
| 471 | useful to look at it in a fresh Emacs session and compare its contents |
| 472 | with a session that you are debugging. |
| 473 | |
| 474 | ** Debugging problems with non-ASCII characters |
| 475 | |
| 476 | If you experience problems which seem to be related to non-ASCII |
| 477 | characters, such as \201 characters appearing in the buffer or in your |
| 478 | files, set the variable byte-debug-flag to t. This causes Emacs to do |
| 479 | some extra checks, such as look for broken relations between byte and |
| 480 | character positions in buffers and strings; the resulting diagnostics |
| 481 | might pinpoint the cause of the problem. |
| 482 | |
| 483 | ** Debugging the TTY (non-windowed) version |
| 484 | |
| 485 | The most convenient method of debugging the character-terminal display |
| 486 | is to do that on a window system such as X. Begin by starting an |
| 487 | xterm window, then type these commands inside that window: |
| 488 | |
| 489 | $ tty |
| 490 | $ echo $TERM |
| 491 | |
| 492 | Let's say these commands print "/dev/ttyp4" and "xterm", respectively. |
| 493 | |
| 494 | Now start Emacs (the normal, windowed-display session, i.e. without |
| 495 | the `-nw' option), and invoke "M-x gdb RET emacs RET" from there. Now |
| 496 | type these commands at GDB's prompt: |
| 497 | |
| 498 | (gdb) set args -nw -t /dev/ttyp4 |
| 499 | (gdb) set environment TERM xterm |
| 500 | (gdb) run |
| 501 | |
| 502 | The debugged Emacs should now start in no-window mode with its display |
| 503 | directed to the xterm window you opened above. |
| 504 | |
| 505 | A similar arrangement is possible on a character terminal by using the |
| 506 | `screen' package. |
| 507 | |
| 508 | ** Running Emacs built with malloc debugging packages |
| 509 | |
| 510 | If Emacs exhibits bugs that seem to be related to use of memory |
| 511 | allocated off the heap, it might be useful to link Emacs with a |
| 512 | special debugging library, such as Electric Fence (a.k.a. efence) or |
| 513 | GNU Checker, which helps find such problems. |
| 514 | |
| 515 | Emacs compiled with such packages might not run without some hacking, |
| 516 | because Emacs replaces the system's memory allocation functions with |
| 517 | its own versions, and because the dumping process might be |
| 518 | incompatible with the way these packages track allocated |
| 519 | memory. Here are some of the changes you might find necessary |
| 520 | (SYSTEM-NAME and MACHINE-NAME are the names of your OS- and |
| 521 | CPU-specific headers in the subdirectories of `src'): |
| 522 | |
| 523 | - In src/s/SYSTEM-NAME.h add "#define SYSTEM_MALLOC". |
| 524 | |
| 525 | - In src/m/MACHINE-NAME.h add "#define CANNOT_DUMP" and |
| 526 | "#define CANNOT_UNEXEC". |
| 527 | |
| 528 | - Configure with a different --prefix= option. If you use GCC, |
| 529 | version 2.7.2 is preferred, as some malloc debugging packages |
| 530 | work a lot better with it than with 2.95 or later versions. |
| 531 | |
| 532 | - Type "make" then "make -k install". |
| 533 | |
| 534 | - If required, invoke the package-specific command to prepare |
| 535 | src/temacs for execution. |
| 536 | |
| 537 | - cd ..; src/temacs |
| 538 | |
| 539 | (Note that this runs `temacs' instead of the usual `emacs' executable. |
| 540 | This avoids problems with dumping Emacs mentioned above.) |
| 541 | |
| 542 | Some malloc debugging libraries might print lots of false alarms for |
| 543 | bitfields used by Emacs in some data structures. If you want to get |
| 544 | rid of the false alarms, you will have to hack the definitions of |
| 545 | these data structures in the respective headers to remove the `:N' |
| 546 | bitfield definitions (which will cause each such field to use a full |
| 547 | int). |
| 548 | |
| 549 | ** Some suggestions for debugging on MS Windows: |
| 550 | |
| 551 | (written by Marc Fleischeuers, Geoff Voelker and Andrew Innes) |
| 552 | |
| 553 | To debug Emacs with Microsoft Visual C++, you either start emacs from |
| 554 | the debugger or attach the debugger to a running emacs process. |
| 555 | |
| 556 | To start emacs from the debugger, you can use the file bin/debug.bat. |
| 557 | Microsoft Developer Studio will start and under Project, Settings, |
| 558 | Debug, General you can set the command-line arguments and Emacs's |
| 559 | startup directory. Set breakpoints (Edit, Breakpoints) at Fsignal and |
| 560 | other functions that you want to examine. Run the program (Build, |
| 561 | Start debug). Emacs will start and the debugger will take control as |
| 562 | soon as a breakpoint is hit. |
| 563 | |
| 564 | You can also attach the debugger to an already running Emacs process. |
| 565 | To do this, start up Microsoft Developer Studio and select Build, |
| 566 | Start debug, Attach to process. Choose the Emacs process from the |
| 567 | list. Send a break to the running process (Debug, Break) and you will |
| 568 | find that execution is halted somewhere in user32.dll. Open the stack |
| 569 | trace window and go up the stack to w32_msg_pump. Now you can set |
| 570 | breakpoints in Emacs (Edit, Breakpoints). Continue the running Emacs |
| 571 | process (Debug, Step out) and control will return to Emacs, until a |
| 572 | breakpoint is hit. |
| 573 | |
| 574 | To examine the contents of a Lisp variable, you can use the function |
| 575 | 'debug_print'. Right-click on a variable, select QuickWatch (it has |
| 576 | an eyeglass symbol on its button in the toolbar), and in the text |
| 577 | field at the top of the window, place 'debug_print(' and ')' around |
| 578 | the expression. Press 'Recalculate' and the output is sent to stderr, |
| 579 | and to the debugger via the OutputDebugString routine. The output |
| 580 | sent to stderr should be displayed in the console window that was |
| 581 | opened when the emacs.exe executable was started. The output sent to |
| 582 | the debugger should be displayed in the 'Debug' pane in the Output |
| 583 | window. If Emacs was started from the debugger, a console window was |
| 584 | opened at Emacs' startup; this console window also shows the output of |
| 585 | 'debug_print'. |
| 586 | |
| 587 | For example, start and run Emacs in the debugger until it is waiting |
| 588 | for user input. Then click on the `Break' button in the debugger to |
| 589 | halt execution. Emacs should halt in `ZwUserGetMessage' waiting for |
| 590 | an input event. Use the `Call Stack' window to select the procedure |
| 591 | `w32_msg_pump' up the call stack (see below for why you have to do |
| 592 | this). Open the QuickWatch window and enter |
| 593 | "debug_print(Vexec_path)". Evaluating this expression will then print |
| 594 | out the contents of the Lisp variable `exec-path'. |
| 595 | |
| 596 | If QuickWatch reports that the symbol is unknown, then check the call |
| 597 | stack in the `Call Stack' window. If the selected frame in the call |
| 598 | stack is not an Emacs procedure, then the debugger won't recognize |
| 599 | Emacs symbols. Instead, select a frame that is inside an Emacs |
| 600 | procedure and try using `debug_print' again. |
| 601 | |
| 602 | If QuickWatch invokes debug_print but nothing happens, then check the |
| 603 | thread that is selected in the debugger. If the selected thread is |
| 604 | not the last thread to run (the "current" thread), then it cannot be |
| 605 | used to execute debug_print. Use the Debug menu to select the current |
| 606 | thread and try using debug_print again. Note that the debugger halts |
| 607 | execution (e.g., due to a breakpoint) in the context of the current |
| 608 | thread, so this should only be a problem if you've explicitly switched |
| 609 | threads. |
| 610 | |
| 611 | It is also possible to keep appropriately masked and typecast Lisp |
| 612 | symbols in the Watch window; this is more convenient when stepping |
| 613 | through the code. For instance, on entering apply_lambda, you can |
| 614 | watch (struct Lisp_Symbol *) (0xfffffff & args[0]). |
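
To see what that mask does, it can help to apply it by hand to a value
from the GDB examples earlier in this document. The sketch below
assumes the object layout implied by the 0xfffffff mask, that is, the
pointer occupies the low 28 bits of the value; other builds may use a
different layout. The function name `untag' is hypothetical.

```shell
# Strip the tag bits from a Lisp_Object value, keeping the low 28 bits.
# The 0xfffffff mask is the one used in the text; it is only valid for
# builds with that object layout.
untag () {
    printf '0x%x\n' $(( $1 & 0xfffffff ))
}
```

With the frame value from the `xframe' example, "untag 1077872640"
prints 0x3f0800, which is the pointer that `xframe' reported there.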
| 615 | |
| 616 | Optimizations often confuse the MS debugger. For example, the |
| 617 | debugger will sometimes report wrong line numbers, e.g., when it |
| 618 | prints the backtrace for a crash. It is usually best to look at the |
| 619 | disassembly to determine exactly what code is being run--the |
| 620 | disassembly will probably show several source lines followed by a |
| 621 | block of assembler for those lines. The actual point where Emacs |
| 622 | crashes will be one of those source lines, but not necessarily the one |
| 623 | that the debugger reports. |
| 624 | |
| 625 | Another problematic area with the MS debugger is with variables that |
| 626 | are stored in registers: it will sometimes display wrong values for |
| 627 | those variables. Usually you will not be able to see any value for a |
| 628 | register variable, but if it is only being stored in a register |
| 629 | temporarily, you will see an old value for it. Again, you need to |
| 630 | look at the disassembly to determine which registers are being used, |
| 631 | and look at those registers directly, to see the actual current values |
| 632 | of these variables. |