@c -*-texinfo-*-
@c This is part of the GNU Emacs Lisp Reference Manual.
-@c Copyright (C) 1990-1993, 1998-1999, 2001-2012 Free Software Foundation, Inc.
+@c Copyright (C) 1990-1993, 1998-1999, 2001-2014 Free Software
+@c Foundation, Inc.
@c See the file elisp.texi for copying conditions.
@node GNU Emacs Internals
@appendix GNU Emacs Internals
* Pure Storage:: Kludge to make preloaded Lisp functions shareable.
* Garbage Collection:: Reclaiming space for Lisp objects no longer used.
* Memory Usage:: Info about total size of Lisp objects made so far.
+* C Dialect:: What C variant Emacs is written in.
* Writing Emacs Primitives:: Writing C code for Emacs.
* Object Internals:: Data formats of buffers, windows, processes.
+* C Integer Types:: How C integer types are used inside Emacs.
@end menu
@node Building Emacs
time.)
@end itemize
+@cindex change @code{load-path} at configure time
+@cindex @option{--enable-locallisppath} option to @command{configure}
It is not advisable to put anything in @file{site-load.el} or
@file{site-init.el} that would alter any of the features that users
expect in an ordinary unmodified Emacs. If you feel you must override
normal features for your site, do it with @file{default.el}, so that
users can override your changes if they wish. @xref{Startup Summary}.
+Note that if either @file{site-load.el} or @file{site-init.el} changes
+@code{load-path}, the changes will be lost after dumping.
+@xref{Library Search}. To make a permanent change to
+@code{load-path}, use the @option{--enable-locallisppath} option
+of @command{configure}.
In a package that can be preloaded, it is sometimes necessary (or
useful) to delay certain evaluations until Emacs subsequently starts
future allocations. So an overall result is:
@example
-((@code{conses} @var{cons-size} @var{used-conse} @var{free-conses})
+((@code{conses} @var{cons-size} @var{used-conses} @var{free-conses})
(@code{symbols} @var{symbol-size} @var{used-symbols} @var{free-symbols})
(@code{miscs} @var{misc-size} @var{used-miscs} @var{free-miscs})
(@code{strings} @var{string-size} @var{used-strings} @var{free-strings})
@table @var
@item cons-size
-Internal size of a cons cell, i.e.@: @code{sizeof (struct Lisp_Cons)}.
+Internal size of a cons cell, i.e., @code{sizeof (struct Lisp_Cons)}.
@item used-conses
The number of cons cells in use.
the operating system, but that are not currently being used.
@item symbol-size
-Internal size of a symbol, i.e.@: @code{sizeof (struct Lisp_Symbol)}.
+Internal size of a symbol, i.e., @code{sizeof (struct Lisp_Symbol)}.
@item used-symbols
The number of symbols in use.
the operating system, but that are not currently being used.
@item misc-size
-Internal size of a miscellaneous entity, i.e.@:
+Internal size of a miscellaneous entity, i.e.,
@code{sizeof (union Lisp_Misc)}, which is a size of the
largest type enumerated in @code{enum Lisp_Misc_Type}.
from the operating system, but that are not currently being used.
@item string-size
-Internal size of a string header, i.e.@: @code{sizeof (struct Lisp_String)}.
+Internal size of a string header, i.e., @code{sizeof (struct Lisp_String)}.
@item used-strings
The number of string headers in use.
The total size of all string data in bytes.
@item vector-size
-Internal size of a vector header, i.e.@: @code{sizeof (struct Lisp_Vector)}.
+Internal size of a vector header, i.e., @code{sizeof (struct Lisp_Vector)}.
@item used-vectors
The number of vector headers allocated from the vector blocks.
The number of free slots in all vector blocks.
@item float-size
-Internal size of a float object, i.e.@: @code{sizeof (struct Lisp_Float)}.
+Internal size of a float object, i.e., @code{sizeof (struct Lisp_Float)}.
(Do not confuse it with the native platform @code{float} or @code{double}.)
@item used-floats
the operating system, but that are not currently being used.
@item interval-size
-Internal size of an interval object, i.e.@: @code{sizeof (struct interval)}.
+Internal size of an interval object, i.e., @code{sizeof (struct interval)}.
@item used-intervals
The number of intervals in use.
the operating system, but that are not currently being used.
@item buffer-size
-Internal size of a buffer, i.e.@: @code{sizeof (struct buffer)}.
+Internal size of a buffer, i.e., @code{sizeof (struct buffer)}.
(Do not confuse with the value returned by @code{buffer-size} function.)
@item used-buffers
The number of buffer objects in use. This includes killed buffers
-invisible to users, i.e.@: all buffers in @code{all_buffers} list.
+invisible to users, i.e., all buffers in @code{all_buffers} list.
@item unit-size
The unit of heap space measurement, always equal to 1024 bytes.
@defvar gc-elapsed
This variable contains the total number of seconds of elapsed time
-during garbage collection so far in this Emacs session, as a floating
-point number.
+during garbage collection so far in this Emacs session, as a
+floating-point number.
@end defvar
@node Memory Usage
Emacs session.
@end defvar
+@node C Dialect
+@section C Dialect
+@cindex C programming language
+
+The C part of Emacs is portable to C99 or later: C11-specific features such
+as @samp{<stdalign.h>} and @samp{_Noreturn} are not used without a check,
+typically at configuration time, and the Emacs build procedure
+provides a substitute implementation if necessary. Some C11 features,
+such as anonymous structures and unions, are too difficult to emulate,
+so they are avoided entirely.
+
+At some point in the future the base C dialect will no doubt change to C11.
+
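As a rough illustration of the ``check first, substitute if necessary''
approach described above, the following sketch uses a C11 feature when
the compiler advertises C11 support and falls back to a substitute
otherwise.  The names @code{MY_NORETURN}, @code{MY_ALIGNOF},
@code{my_fatal} and @code{my_double_alignment} are hypothetical and do
not come from the Emacs sources.  Emacs performs most such checks at
configuration time, not with a preprocessor test like this one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical names, for illustration only.  */
#if defined __STDC_VERSION__ && 201112L <= __STDC_VERSION__
/* The compiler claims C11 support: use the C11 features directly.  */
# define MY_NORETURN _Noreturn
# define MY_ALIGNOF(type) _Alignof (type)
#else
/* Substitutes for a C99-only compiler.  An empty MY_NORETURN loses
   only an optimization hint; the offsetof trick approximates the
   alignment requirement of TYPE.  */
# define MY_NORETURN
# define MY_ALIGNOF(type) offsetof (struct { char c; type member; }, member)
#endif

/* A function that never returns to its caller.  */
MY_NORETURN void
my_fatal (void)
{
  abort ();
}

/* Return the alignment of 'double' using whichever definition of
   MY_ALIGNOF was selected above.  */
size_t
my_double_alignment (void)
{
  return MY_ALIGNOF (double);
}
```

Callers use only the @code{MY_}-prefixed names, so they do not need to
know which branch was selected.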
@node Writing Emacs Primitives
@section Writing Emacs Primitives
@cindex primitive function internals
@cindex writing Emacs primitives
- Lisp primitives are Lisp functions implemented in C. The details of
+ Lisp primitives are Lisp functions implemented in C@. The details of
interfacing the C function so that Lisp can call it are handled by a few
C macros. The only way to really understand how to write new C code is
to read the source, but we can explain some things here.
macros. If @var{max} is a number, it must be more than @var{min} but
less than 8.
+@cindex interactive specification in primitives
@item interactive
-This is an interactive specification, a string such as might be used as
-the argument of @code{interactive} in a Lisp function. In the case of
-@code{or}, it is 0 (a null pointer), indicating that @code{or} cannot be
-called interactively. A value of @code{""} indicates a function that
-should receive no arguments when called interactively. If the value
-begins with a @samp{(}, the string is evaluated as a Lisp form.
-For examples of the last two forms, see @code{widen} and
-@code{narrow-to-region} in @file{editfns.c}.
+This is an interactive specification, a string such as might be used
+as the argument of @code{interactive} in a Lisp function. In the case
+of @code{or}, it is 0 (a null pointer), indicating that @code{or}
+cannot be called interactively. A value of @code{""} indicates a
+function that should receive no arguments when called interactively.
+If the value begins with a @samp{"(}, the string is evaluated as a
+Lisp form. For example:
+
+@example
+@group
+DEFUN ("foo", Ffoo, Sfoo, 0, UNEVALLED,
+ "(list (read-char-by-name \"Insert character: \")\
+ (prefix-numeric-value current-prefix-arg)\
+ t))",
+ doc: /* @dots{} */)
+@end group
+@end example
@item doc
This is the documentation string. It uses C comment syntax rather
the number of Lisp arguments, it must have exactly two C arguments:
the first is the number of Lisp arguments, and the second is the
address of a block containing their values. These have types
-@code{int} and @w{@code{Lisp_Object *}} respectively. Since
+@code{int} and @w{@code{Lisp_Object *}} respectively. Since
@code{Lisp_Object} can hold any Lisp object of any data type, you
can determine the actual data type only at run time; so if you want
a primitive to accept only a certain type of argument, you must check
@end smallexample
Note that C code cannot call functions by name unless they are defined
-in C. The way to call a function written in Lisp is to use
+in C@. The way to call a function written in Lisp is to use
@code{Ffuncall}, which embodies the Lisp function @code{funcall}. Since
the Lisp function @code{funcall} accepts an unlimited number of
arguments, in C it takes two: the number of Lisp-level arguments, and a
@cindex object internals
Emacs Lisp provides a rich set of the data types. Some of them, like cons
-cells, integers and stirngs, are common to nearly all Lisp dialects. Some
+cells, integers and strings, are common to nearly all Lisp dialects. Some
others, like markers and buffers, are quite special and needed to provide
the basic support to write editor commands in Lisp. To implement such
a variety of object types and provide an efficient way to pass objects between
vectorlike or miscellaneous object. Each of these data types has the
corresponding tag value. All tags are enumerated by @code{enum Lisp_Type}
and placed into a 3-bit bitfield of the @code{Lisp_Object}. The rest of the
-bits is the value itself. Integer values are immediate, i.e.@: directly
+bits is the value itself. Integers are immediate, i.e., directly
represented by those @dfn{value bits}, and all other objects are represented
by the C pointers to a corresponding object allocated from the heap. Width
of the @code{Lisp_Object} is platform- and configuration-dependent: usually
-it's equal to the width of an underlying platform pointer (i.e.@: 32-bit on
+it's equal to the width of an underlying platform pointer (i.e., 32-bit on
a 32-bit machine and 64-bit on a 64-bit one), but also there is a special
configuration where @code{Lisp_Object} is 64-bit but all pointers are 32-bit.
The latter trick was designed to overcome the limited range of values for
Symbol, the unique-named entity commonly used as an identifier.
@item struct Lisp_Float
-Floating point value.
+Floating-point value.
@item union Lisp_Misc
Miscellaneous kinds of objects which don't fit into any of the above.
@cindex buffer internals
Two structures (see @file{buffer.h}) are used to represent buffers
-in C. The @code{buffer_text} structure contains fields describing the
+in C@. The @code{buffer_text} structure contains fields describing the
text of a buffer; the @code{buffer} structure holds other fields. In
the case of indirect buffers, two or more @code{buffer} structures
reference the same @code{buffer_text} structure.
no access to the parent windows; they operate on the windows at the
leaves of the tree, which actually display buffers.
+@c FIXME: These two slots and the `buffer' slot below were replaced
+@c with a single slot `contents' on 2013-03-28. --xfq
@item hchild
@itemx vchild
These fields contain the window's leftmost child and its topmost child
respectively. @code{hchild} is used if the window is subdivided
horizontally by child windows, and @code{vchild} if it is subdivided
vertically. In a live window, only one of @code{hchild}, @code{vchild},
-and @code{buffer} (q.v.) is non-@code{nil}.
+and @code{buffer} (q.v.@:) is non-@code{nil}.
@item next
@itemx prev
@code{nil} meaning none is known. If it is a buffer, don't display
the line number as long as the window shows that buffer.
-@item region_showing
-If the region (or part of it) is highlighted in this window, this field
-holds the mark position that made one end of that region. Otherwise,
-this field is @code{nil}.
-
@item column_number_displayed
The column number currently displayed in this window's mode line, or @code{nil}
if column numbers are not being displayed.
process is running or @code{t} if the process is stopped.
@item filter
-If non-@code{nil}, a function used to accept output from the process
-instead of a buffer.
+A function used to accept output from the process.
@item sentinel
-If non-@code{nil}, a function called whenever the state of the process
-changes.
+A function called whenever the state of the process changes.
@item buffer
The associated buffer of the process.
@end table
+@node C Integer Types
+@section C Integer Types
+@cindex integer types (C programming language)
+
+Here are some guidelines for use of integer types in the Emacs C
+source code. These guidelines sometimes give competing advice; use
+common sense when they conflict.
+
+@itemize @bullet
+@item
+Avoid arbitrary limits. For example, avoid @code{int len = strlen
+(s);} unless the length of @code{s} is required for other reasons to
+fit in @code{int} range.
+
+@item
+Do not assume that signed integer arithmetic wraps around on overflow.
+Wraparound is no longer reliable on Emacs porting targets: signed integer
+overflow has undefined behavior in practice, and can dump core or
+even cause earlier or later code to behave ``illogically''. Unsigned
+overflow does wrap around reliably, modulo a power of two.
+
+@item
+Prefer signed types to unsigned, as code gets confusing when signed
+and unsigned types are combined. Many other guidelines assume that
+types are signed; in the rarer cases where unsigned types are needed,
+similar advice may apply to the unsigned counterparts (e.g.,
+@code{size_t} instead of @code{ptrdiff_t}, or @code{uintptr_t} instead
+of @code{intptr_t}).
+
+@item
+Prefer @code{int} for Emacs character codes, in the range 0 ..@: 0x3FFFFF.
+
+@item
+Prefer @code{ptrdiff_t} for sizes, i.e., for integers bounded by the
+maximum size of any individual C object or by the maximum number of
+elements in any C array. This is part of Emacs's general preference
+for signed types. Using @code{ptrdiff_t} limits objects to
+@code{PTRDIFF_MAX} bytes, but larger objects would cause trouble
+anyway since they would break pointer subtraction, so this does not
+impose an arbitrary limit.
+
+@item
+Prefer @code{intptr_t} for internal representations of pointers, or
+for integers bounded only by the number of objects that can exist at
+any given time or by the total number of bytes that can be allocated.
+Currently Emacs sometimes uses other types when @code{intptr_t} would
+be better; fixing this is lower priority, as the code works as-is on
+Emacs's current porting targets.
+
+@item
+Prefer the Emacs-defined type @code{EMACS_INT} for representing values
+converted to or from Emacs Lisp fixnums, as fixnum arithmetic is based
+on @code{EMACS_INT}.
+
+@item
+When representing a system value (such as a file size or a count of
+seconds since the Epoch), prefer the corresponding system type (e.g.,
+@code{off_t}, @code{time_t}). Do not assume that a system type is
+signed, unless this assumption is known to be safe. For example,
+although @code{off_t} is always signed, @code{time_t} need not be.
+
+@item
+Prefer the Emacs-defined type @code{printmax_t} for representing
+values that might be any signed integer that can be printed using a
+@code{printf}-family function.
+
+@item
+Prefer @code{intmax_t} for representing values that might be any
+signed integer value.
+
+@item
+Prefer @code{bool}, @code{false} and @code{true} for booleans.
+Using @code{bool} can make programs easier to read and a bit faster than
+using @code{int}. Although it is also OK to use @code{int}, @code{0}
+and @code{1}, this older style is gradually being phased out. When
+using @code{bool}, respect the limitations of the replacement
+implementation of @code{bool}, as documented in the source file
+@file{lib/stdbool.in.h}, so that Emacs remains portable to pre-C99
+platforms. In particular, boolean bitfields should be of type
+@code{bool_bf}, not @code{bool}, so that they work correctly even when
+compiling Objective C with standard GCC.
+
+@item
+In bitfields, prefer @code{unsigned int} or @code{signed int} to
+@code{int}, as @code{int} is less portable: it might be signed, and
+might not be. Single-bit bitfields should be @code{unsigned int} or
+@code{bool_bf} so that their values are 0 or 1.
+@end itemize
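
Two of the guidelines above can be sketched in C as follows.  The
helper names @code{checked_add} and @code{string_length} are
hypothetical and are not part of the Emacs sources.  The first helper
tests for overflow before adding, so it never relies on signed
wraparound.  The second returns a length as @code{ptrdiff_t} instead of
truncating it to @code{int}.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Store A + B in *RESULT and return true, or return false without
   touching *RESULT if the sum would overflow.  The range test is done
   before the addition, so no signed overflow ever occurs.  */
bool
checked_add (intmax_t a, intmax_t b, intmax_t *result)
{
  if (b < 0 ? a < INTMAX_MIN - b : INTMAX_MAX - b < a)
    return false;
  *result = a + b;
  return true;
}

/* Return the length of S as a ptrdiff_t rather than an int, so that
   no arbitrary INT_MAX limit is imposed on the string.  */
ptrdiff_t
string_length (char const *s)
{
  size_t len = strlen (s);
  /* An object larger than PTRDIFF_MAX bytes would break pointer
     subtraction anyway.  */
  assert (len <= PTRDIFF_MAX);
  return (ptrdiff_t) len;
}
```

For instance, @code{checked_add (INTMAX_MAX, 1, &r)} reports failure
instead of invoking undefined behavior.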
+
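The bitfield guideline can be illustrated with a small hypothetical
structure.  The names @code{demo_flags}, @code{demo_make} and
@code{demo_is_visible} are invented for this sketch.  Each bitfield
states its signedness explicitly, and the single-bit flag is unsigned
so that its value is 0 or 1.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical structure, for illustration only.  */
struct demo_flags
{
  unsigned int visible : 1;	/* Holds 0 or 1.  */
  signed int delta : 3;		/* Holds -4 through 3.  */
};

/* Build a demo_flags value.  DELTA must be in the range -4 .. 3.  */
struct demo_flags
demo_make (bool visible, int delta)
{
  struct demo_flags f;
  f.visible = visible;
  f.delta = delta;
  return f;
}

/* Because 'visible' is unsigned, a set flag compares equal to 1.  */
bool
demo_is_visible (struct demo_flags f)
{
  return f.visible == 1;
}
```

Had @code{visible} been declared as a one-bit @code{signed int}, its
only representable values would be 0 and @minus{}1, and the comparison
with 1 would never succeed.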
@c FIXME Mention src/globals.h somewhere in this file?