@c -*-texinfo-*-
@c This is part of the GNU Emacs Lisp Reference Manual.
-@c Copyright (C) 1990-1993, 1998-1999, 2001-2012 Free Software Foundation, Inc.
+@c Copyright (C) 1990-1993, 1998-1999, 2001-2013 Free Software
+@c Foundation, Inc.
@c See the file elisp.texi for copying conditions.
-@setfilename ../../info/internals
-@node GNU Emacs Internals, Standard Errors, Tips, Top
-@comment node-name, next, previous, up
+@node GNU Emacs Internals
@appendix GNU Emacs Internals
This chapter describes how the runnable Emacs executable is dumped with
This section explains the steps involved in building the Emacs
executable. You don't have to know this material to build and install
Emacs, since the makefiles do all these things automatically. This
-information is pertinent to Emacs maintenance.
+information is pertinent to Emacs developers.
Compilation of the C source files in the @file{src} directory
produces an executable file called @file{temacs}, also called a
-@dfn{bare impure Emacs}. It contains the Emacs Lisp interpreter and I/O
-routines, but not the editing commands.
+@dfn{bare impure Emacs}. It contains the Emacs Lisp interpreter and
+I/O routines, but not the editing commands.
@cindex @file{loadup.el}
- The command @w{@samp{temacs -l loadup}} uses @file{temacs} to create
-the real runnable Emacs executable. These arguments direct
-@file{temacs} to evaluate the Lisp files specified in the file
-@file{loadup.el}. These files set up the normal Emacs editing
-environment, resulting in an Emacs that is still impure but no longer
-bare.
+ The command @w{@command{temacs -l loadup}} would run @file{temacs}
+and direct it to load @file{loadup.el}. The @code{loadup} library
+loads additional Lisp libraries, which set up the normal Emacs editing
+environment. After this step, the Emacs executable is no longer
+@dfn{bare}.
@cindex dumping Emacs
- It takes some time to load the standard Lisp files. Luckily,
-you don't have to do this each time you run Emacs; @file{temacs} can
-dump out an executable program called @file{emacs} that has these files
-preloaded. @file{emacs} starts more quickly because it does not need to
-load the files. This is the Emacs executable that is normally
-installed.
-
+ Because it takes some time to load the standard Lisp files, the
+@file{temacs} executable usually isn't run directly by users.
+Instead, as one of the last steps of building Emacs, the command
+@samp{temacs -batch -l loadup dump} is run. The special @samp{dump}
+argument causes @command{temacs} to dump out an executable program,
+called @file{emacs}, which has all the standard Lisp files preloaded.
+(The @samp{-batch} argument prevents @file{temacs} from trying to
+initialize any of its data on the terminal, so that the tables of
+terminal information are empty in the dumped Emacs.)
+
+@cindex preloaded Lisp files
@vindex preloaded-file-list
-@cindex dumped Lisp files
- To create @file{emacs}, use the command @samp{temacs -batch -l loadup
-dump}. The purpose of @samp{-batch} here is to prevent @file{temacs}
-from trying to initialize any of its data on the terminal; this ensures
-that the tables of terminal information are empty in the dumped Emacs.
-The argument @samp{dump} tells @file{loadup.el} to dump a new executable
-named @file{emacs}. The variable @code{preloaded-file-list} stores a
-list of the Lisp files that were dumped with the @file{emacs} executable.
-
- If you port Emacs to a new operating system, and are not able to
-implement dumping, then Emacs must load @file{loadup.el} each time it
-starts.
+ The dumped @file{emacs} executable (also called a @dfn{pure} Emacs)
+is the one which is installed. The variable
+@code{preloaded-file-list} stores a list of the Lisp files preloaded
+into the dumped Emacs. If you port Emacs to a new operating system,
+and are not able to implement dumping, then Emacs must load
+@file{loadup.el} each time it starts.
@cindex @file{site-load.el}
You can specify additional files to preload by writing a library named
This function delays the initialization of @var{symbol} to the next
Emacs start. You normally use this function by specifying it as the
@code{:initialize} property of a customizable variable. (The argument
-@var{value} is unused, and is provided only for compatiblity with the
+@var{value} is unused, and is provided only for compatibility with the
form Custom expects.)
@end defun
in the preloaded standard Lisp files---data that should never change
during actual use of Emacs.
- Pure storage is allocated only while @file{temacs} is loading the
+ Pure storage is allocated only while @command{temacs} is loading the
standard preloaded Lisp libraries. In the file @file{emacs}, it is
marked as read-only (on operating systems that permit this), so that
the memory space can be shared by all the Emacs jobs running on the
@node Garbage Collection
@section Garbage Collection
-@cindex garbage collection
@cindex memory allocation
- When a program creates a list or the user defines a new function (such
-as by loading a library), that data is placed in normal storage. If
-normal storage runs low, then Emacs asks the operating system to
-allocate more memory in blocks of 1k bytes. Each block is used for one
-type of Lisp object, so symbols, cons cells, markers, etc., are
-segregated in distinct blocks in memory. (Vectors, long strings,
-buffers and certain other editing types, which are fairly large, are
-allocated in individual blocks, one per object, while small strings are
-packed into blocks of 8k bytes.)
-
- It is quite common to use some storage for a while, then release it by
-(for example) killing a buffer or deleting the last pointer to an
-object. Emacs provides a @dfn{garbage collector} to reclaim this
-abandoned storage. (This name is traditional, but ``garbage recycler''
-might be a more intuitive metaphor for this facility.)
+ When a program creates a list or the user defines a new function
+(such as by loading a library), that data is placed in normal storage.
+If normal storage runs low, then Emacs asks the operating system to
+allocate more memory. Different types of Lisp objects, such as
+symbols, cons cells, small vectors, markers, etc., are segregated in
+distinct blocks in memory. (Large vectors, long strings, buffers and
+certain other editing types, which are fairly large, are allocated in
+individual blocks, one per object; small strings are packed into blocks
+of 8k bytes, and small vectors are packed into blocks of 4k bytes.)
+
+@cindex vector-like objects, storage
+@cindex storage of vector-like Lisp objects
+ Beyond the basic vector, many objects such as windows, buffers, and
+frames are managed as if they were vectors. The corresponding C data
+structures include a @code{struct vectorlike_header} field whose
+@code{next} member points to the next object in the chain:
+@code{header.next.buffer} points to the next buffer (which could be
+a killed buffer), and @code{header.next.vector} points to the next
+vector in a free list. If a vector is small (smaller than or equal to
+@code{VBLOCK_BYTES_MAX} bytes, see @file{alloc.c}), then
+@code{header.next.nbytes} contains the vector size in bytes.
- The garbage collector operates by finding and marking all Lisp objects
-that are still accessible to Lisp programs. To begin with, it assumes
-all the symbols, their values and associated function definitions, and
-any data presently on the stack, are accessible. Any objects that can
-be reached indirectly through other accessible objects are also
-accessible.
+@cindex garbage collection
+ It is quite common to use some storage for a while, then release it
+by (for example) killing a buffer or deleting the last pointer to an
+object. Emacs provides a @dfn{garbage collector} to reclaim this
+abandoned storage. The garbage collector operates by finding and
+marking all Lisp objects that are still accessible to Lisp programs.
+To begin with, it assumes all the symbols, their values and associated
+function definitions, and any data presently on the stack, are
+accessible. Any objects that can be reached indirectly through other
+accessible objects are also accessible.
When marking is finished, all objects still unmarked are garbage. No
matter what the Lisp program or the user does, it is impossible to refer
The sweep phase puts unused cons cells onto a @dfn{free list}
for future allocation; likewise for symbols and markers. It compacts
the accessible strings so they occupy fewer 8k blocks; then it frees the
-other 8k blocks. Vectors, buffers, windows, and other large objects are
-individually allocated and freed using @code{malloc} and @code{free}.
+other 8k blocks. Unreachable vectors from vector blocks are coalesced
+to create the largest possible free areas; if a free area spans a complete
+4k block, that block is freed. Otherwise, the free area is recorded
+in a free list array, where each entry corresponds to a free list
+of areas of the same size. Large vectors, buffers, and other large
+objects are allocated and freed individually.
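
 The mark and sweep phases described above can be pictured with a
small, self-contained C model. This is only a sketch: every name in it
(@code{struct cell}, @code{toy_cons}, @code{mark}, @code{sweep}, and so
on) is hypothetical and does not appear in @file{alloc.c}, and it
handles only cons-like cells, ignoring strings, vectors, and blocks.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: these names are illustrative, not those in alloc.c.  */
enum { NCELLS = 8 };

struct cell
{
  struct cell *car;   /* In this toy, a cell can only point to cells.  */
  struct cell *cdr;   /* Doubles as the free-list link when unused.  */
  int marked;
  int in_use;
};

static struct cell heap[NCELLS];
static struct cell *free_list;

/* Put every cell on the free list.  */
static void
init_heap (void)
{
  free_list = NULL;
  for (int i = 0; i < NCELLS; i++)
    {
      heap[i].car = NULL;
      heap[i].marked = 0;
      heap[i].in_use = 0;
      heap[i].cdr = free_list;
      free_list = &heap[i];
    }
}

/* Take a cell from the free list; return NULL if none is left.  */
static struct cell *
toy_cons (struct cell *car, struct cell *cdr)
{
  struct cell *c = free_list;
  if (!c)
    return NULL;
  free_list = c->cdr;
  c->car = car;
  c->cdr = cdr;
  c->in_use = 1;
  return c;
}

/* Mark phase: flag everything reachable from ROOT.  */
static void
mark (struct cell *root)
{
  while (root && !root->marked)
    {
      root->marked = 1;
      mark (root->car);
      root = root->cdr;
    }
}

/* Sweep phase: return unmarked cells to the free list and clear the
   marks on the survivors.  Return the number of cells reclaimed.  */
static int
sweep (void)
{
  int reclaimed = 0;
  for (int i = 0; i < NCELLS; i++)
    {
      if (heap[i].in_use && !heap[i].marked)
        {
          heap[i].in_use = 0;
          heap[i].car = NULL;
          heap[i].cdr = free_list;
          free_list = &heap[i];
          reclaimed++;
        }
      heap[i].marked = 0;
    }
  return reclaimed;
}
```

 In this model, a caller marks from each root and then calls
@code{sweep}; a cell that no root reaches goes back on the free list
and is handed out again by the next @code{toy_cons}.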
@cindex CL note---allocate more storage
@quotation
The total number of elements of existing vectors.
@item used-floats
-@c Emacs 19 feature
The number of floats in use.
@item free-floats
-@c Emacs 19 feature
The number of floats for which space has been obtained from the
operating system, but that are not currently being used.
If there was overflow in pure space (@pxref{Pure Storage}),
@code{garbage-collect} returns @code{nil}, because a real garbage
-collection can not be done in this situation.
+collection cannot be done.
@end deffn
@defopt garbage-collection-messages
function @code{memory-limit} provides information on the total amount of
memory Emacs is currently using.
-@c Emacs 19 feature
@defun memory-limit
This function returns the address of the last byte Emacs has allocated,
divided by 1024. We divide the value by 1024 to make sure it fits in a
@end defun
@defvar memory-full
-This variable is @code{t} if Emacs is close to out of memory for Lisp
+This variable is @code{t} if Emacs is nearly out of memory for Lisp
objects, and @code{nil} otherwise.
@end defvar
@defvar string-chars-consed
The total number of string characters that have been allocated so far
-in this Emacs session.
+in this session.
@end defvar
@defvar misc-objects-consed
The total number of miscellaneous objects that have been allocated so
-far in this Emacs session. These include markers and overlays, plus
+far in this session. These include markers and overlays, plus
certain objects not visible to users.
@end defvar
@cindex primitive function internals
@cindex writing Emacs primitives
- Lisp primitives are Lisp functions implemented in C. The details of
+ Lisp primitives are Lisp functions implemented in C@. The details of
interfacing the C function so that Lisp can call it are handled by a few
C macros. The only way to really understand how to write new C code is
to read the source, but we can explain some things here.
@smallexample
@group
DEFUN ("or", For, Sor, 0, UNEVALLED, 0,
- doc: /* Eval args until one of them yields non-nil, then return that
-value. The remaining args are not evalled at all.
+ doc: /* Eval args until one of them yields non-nil, then return
+that value.
+The remaining args are not evalled at all.
If all args return nil, return nil.
@end group
@group
the example above, it is @code{or}.
@item fname
-This is the C function name for this function. This is
-the name that is used in C code for calling the function. The name is,
-by convention, @samp{F} prepended to the Lisp name, with all dashes
-(@samp{-}) in the Lisp name changed to underscores. Thus, to call this
-function from C code, call @code{For}. Remember that the arguments must
-be of type @code{Lisp_Object}; various macros and functions for creating
-values of type @code{Lisp_Object} are declared in the file
-@file{lisp.h}.
+This is the C function name for this function. This is the name that
+is used in C code for calling the function. The name is, by
+convention, @samp{F} prepended to the Lisp name, with all dashes
+(@samp{-}) in the Lisp name changed to underscores. Thus, to call
+this function from C code, call @code{For}.
@item sname
This is a C variable name to use for a structure that holds the data for
indicating a special form that receives unevaluated arguments, or
@code{MANY}, indicating an unlimited number of evaluated arguments (the
equivalent of @code{&rest}). Both @code{UNEVALLED} and @code{MANY} are
-macros. If @var{max} is a number, it may not be less than @var{min} and
-it may not be greater than eight.
+macros. If @var{max} is a number, it must not be less than @var{min}
+and must not be greater than 8.
@item interactive
This is an interactive specification, a string such as might be used as
@end table
After the call to the @code{DEFUN} macro, you must write the
-argument list that every C function must have, including the types for
-the arguments. For a function with a fixed maximum number of
-arguments, declare a C argument for each Lisp argument, and give them
-all type @code{Lisp_Object}. When a Lisp function has no upper limit
-on the number of arguments, its implementation in C actually receives
-exactly two arguments: the first is the number of Lisp arguments, and
-the second is the address of a block containing their values. They
-have types @code{int} and @w{@code{Lisp_Object *}}.
+argument list for the C function, including the types for the
+arguments. If the primitive accepts a fixed maximum number of Lisp
+arguments, there must be one C argument for each Lisp argument, and
+each argument must be of type @code{Lisp_Object}. (Various macros and
+functions for creating values of type @code{Lisp_Object} are declared
+in the file @file{lisp.h}.) If the primitive has no upper limit on
+the number of Lisp arguments, it must have exactly two C arguments:
+the first is the number of Lisp arguments, and the second is the
+address of a block containing their values. These have types
+@code{int} and @w{@code{Lisp_Object *}} respectively.
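
 The two argument-list shapes can be compared side by side in a
stand-alone sketch. It does not use the @code{DEFUN} macro, and the
@code{Lisp_Object} typedef below is merely a stand-in (a plain integer)
for the real type in @file{lisp.h}; @code{toy_max2} and @code{toy_sum}
are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for the real type declared in lisp.h; here a Lisp value is
   just an integer, which is an assumption made for this sketch.  */
typedef intptr_t Lisp_Object;

/* Fixed maximum number of arguments: one C argument per Lisp
   argument, each of type Lisp_Object.  */
static Lisp_Object
toy_max2 (Lisp_Object a, Lisp_Object b)
{
  return a > b ? a : b;
}

/* No upper limit (the MANY case): the count comes first, then the
   address of a block holding the values.  */
static Lisp_Object
toy_sum (int nargs, Lisp_Object *args)
{
  Lisp_Object total = 0;
  for (int i = 0; i < nargs; i++)
    total += args[i];
  return total;
}
```

 A caller of the second form builds the block itself, for example a
local array of three values passed as @code{toy_sum (3, block)}.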
@cindex @code{GCPRO} and @code{UNGCPRO}
@cindex protect C variables from garbage collection
Within the function @code{For} itself, note the use of the macros
-@code{GCPRO1} and @code{UNGCPRO}. @code{GCPRO1} is used to
-``protect'' a variable from garbage collection---to inform the garbage
-collector that it must look in that variable and regard its contents
-as an accessible object. GC protection is necessary whenever you call
-@code{eval_sub} (or @code{Feval}) either directly or indirectly.
-At such a time, any Lisp object that this function may refer to again
-must be protected somehow.
+@code{GCPRO1} and @code{UNGCPRO}. These macros are defined for the
+sake of the few platforms which do not use Emacs' default
+stack-marking garbage collector. The @code{GCPRO1} macro ``protects''
+a variable from garbage collection, explicitly informing the garbage
+collector that the variable and all its contents must be regarded as
+accessible. GC protection is necessary in any function which can
+perform Lisp evaluation by calling @code{eval_sub} or @code{Feval} as
+a subroutine, either directly or indirectly.
It suffices to ensure that at least one pointer to each object is
-GC-protected; that way, the object cannot be recycled, so all pointers
-to it remain valid. Thus, a particular local variable can do without
+GC-protected. Thus, a particular local variable can do without
protection if it is certain that the object it points to will be
preserved by some other pointer (such as another local variable that
-has a @code{GCPRO}).
-@ignore
-@footnote{Formerly, strings were a special exception; in older Emacs
-versions, every local variable that might point to a string needed a
-@code{GCPRO}.}.
-@end ignore
-Otherwise, the local variable needs a @code{GCPRO}.
+has a @code{GCPRO}). Otherwise, the local variable needs a
+@code{GCPRO}.
The macro @code{GCPRO1} protects just one local variable. If you
want to protect two variables, use @code{GCPRO2} instead; repeating
implicitly use local variables such as @code{gcpro1}; you must declare
these explicitly, with type @code{struct gcpro}. Thus, if you use
@code{GCPRO2}, you must declare @code{gcpro1} and @code{gcpro2}.
-Alas, we can't explain all the tricky details here.
@code{UNGCPRO} cancels the protection of the variables that are
protected in the current function. It is necessary to do this
explicitly.
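
 One way to picture this protocol is as a linked list of records that
live on the C stack. The following sketch is a simplified model under
that assumption, not the definitions from @file{lisp.h}: the
@code{TOY_} macros, @code{struct toy_gcpro}, @code{toy_gcprolist}, and
the integer @code{Lisp_Object} typedef are all hypothetical. It does
follow the convention described above that the macros implicitly use
locals named @code{gcpro1} and @code{gcpro2}.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef intptr_t Lisp_Object;   /* Stand-in, not the real type.  */

/* One record per protected variable, chained through NEXT.  */
struct toy_gcpro
{
  struct toy_gcpro *next;
  Lisp_Object *var;
};

/* Head of the chain a collector would walk to find protected
   variables.  */
static struct toy_gcpro *toy_gcprolist;

/* These macros rely on locals named gcpro1 and gcpro2 being declared
   by the function that uses them.  */
#define TOY_GCPRO1(v) \
  do { gcpro1.next = toy_gcprolist; gcpro1.var = &(v); \
       toy_gcprolist = &gcpro1; } while (0)
#define TOY_GCPRO2(v1, v2) \
  do { gcpro1.next = toy_gcprolist; gcpro1.var = &(v1); \
       gcpro2.next = &gcpro1; gcpro2.var = &(v2); \
       toy_gcprolist = &gcpro2; } while (0)
/* Restore the chain to what it was before this function's GCPRO.  */
#define TOY_UNGCPRO (toy_gcprolist = gcpro1.next)

/* Count the variables currently on the chain.  */
static int
toy_count_protected (void)
{
  int n = 0;
  for (struct toy_gcpro *p = toy_gcprolist; p; p = p->next)
    n++;
  return n;
}

/* A function shaped like a primitive that protects two locals.  */
static int
toy_primitive (Lisp_Object x, Lisp_Object y)
{
  struct toy_gcpro gcpro1, gcpro2;
  int seen;

  TOY_GCPRO2 (x, y);
  seen = toy_count_protected ();   /* Stands in for a call that may GC.  */
  TOY_UNGCPRO;
  return seen;
}
```

 Because the records are ordinary local variables, forgetting the
explicit @code{TOY_UNGCPRO} in this model would leave the chain
pointing into a stack frame that no longer exists.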
- Built-in functions that take a variable number of arguments actually
-accept two arguments at the C level: the number of Lisp arguments, and
-a @code{Lisp_Object *} pointer to a C vector containing those Lisp
-arguments. This C vector may be part of a Lisp vector, but it need
-not be. The responsibility for using @code{GCPRO} to protect the Lisp
-arguments from GC if necessary rests with the caller in this case,
-since the caller allocated or found the storage for them.
-
You must not use C initializers for static or global variables unless
the variables are never written once Emacs is dumped. These variables
with initializers are allocated in an area of memory that becomes
read-only (on certain operating systems) as a result of dumping Emacs.
@xref{Pure Storage}.
-@c FIXME is this still true? I don't think so...
- Do not use static variables within functions---place all static
-variables at top level in the file. This is necessary because Emacs on
-some operating systems defines the keyword @code{static} as a null
-macro. (This definition is used because those systems put all variables
-declared static in a place that becomes read-only after dumping, whether
-they have initializers or not.)
-
@cindex @code{defsubr}, Lisp symbol for a primitive
Defining the C function is not enough to make a Lisp primitive
available; you must also create the Lisp symbol for the primitive and
@end smallexample
Note that C code cannot call functions by name unless they are defined
-in C. The way to call a function written in Lisp is to use
+in C@. The way to call a function written in Lisp is to use
@code{Ffuncall}, which embodies the Lisp function @code{funcall}. Since
the Lisp function @code{funcall} accepts an unlimited number of
arguments, in C it takes two: the number of Lisp-level arguments, and a
@cindex buffer internals
Two structures (see @file{buffer.h}) are used to represent buffers
-in C. The @code{buffer_text} structure contains fields describing the
+in C@. The @code{buffer_text} structure contains fields describing the
text of a buffer; the @code{buffer} structure holds other fields. In
the case of indirect buffers, two or more @code{buffer} structures
reference the same @code{buffer_text} structure.
respectively. @code{hchild} is used if the window is subdivided
horizontally by child windows, and @code{vchild} if it is subdivided
vertically. In a live window, only one of @code{hchild}, @code{vchild},
-and @code{buffer} (q.v.) is non-@code{nil}.
+and @code{buffer} (q.v.@:) is non-@code{nil}.
@item next
@itemx prev
message in the process buffer.
@item pty_flag
-Non-@code{nil} if communication with the subprocess uses a @acronym{PTY};
+Non-@code{nil} if communication with the subprocess uses a pty;
@code{nil} if it uses a pipe.
@item infd