The following text was written by someone at IBM to describe an older
version of the code for dumping on AIX. It does NOT apply to
the current version of Emacs. It is included in case someone
is curious.


I (rms) couldn't understand the code, and I can't fully understand
this text either. I rewrote the code to use the same basic
principles, as far as I understood them, but more cleanly. This
rewritten code does not always work. In fact, the basic method
seems to be intrinsically flawed.

Since then, someone else implemented a different way of dumping on
the RS/6000, which does seem to work. None of the following
applies to the way Emacs now dumps on the 6000. However, the
current method fails to use shared libraries. Anyone who might be
interested in trying to resurrect the previous method might still
find the following information useful.


It seems that the IBM dumping code was simply set up to detect when
the dumped data cannot be used, and in that case to act approximately
as if CANNOT_DUMP had been defined all along. (This is buried in
paragraph 1.) It seems simpler just to define CANNOT_DUMP, since
Emacs is not set up to decide at run time whether there is dumping or
not, and doing so correctly would be a lot of work.

Note that much of the other information, such as the name and format
of the dumped data file, has been changed.


--rms



A different approach has been taken to implement the
"dump/load" feature of GNU Emacs for AIX 3.1. Traditionally the
unexec function creates a new a.out executable file which contains
preloaded Lisp code. Executing the new a.out file (normally called
xemacs) provides rapid startup since the standard suite of Lisp code
is preloaded as part of the executable file.

AIX 3.1 architecture precludes the use of this technique
because the dynamic loader cannot guarantee a fixed starting location
for the process data section. The loader loads all shared library
data BEFORE process data. When a shared library changes its data
space, the process initial data section address (_data) will change
and all global process variables are automatically relocated to new
addresses. This invalidates the "dumped" Emacs executable, whose
saved data addresses are not relocatable and are now wrong. Emacs
would fail to execute until rebuilt with the new libraries.
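
As a toy illustration of the failure described above (this is not code
from Emacs), the sketch below models the data section as a small array,
"relocates" it by copying it to a new base, and checks whether an
address still falls inside the image. An absolute address recorded
before the move no longer does, while an offset from the base still
finds the same byte. IMAGE_SIZE and both function names are invented
for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Size of the toy "data section"; an arbitrary value for illustration.  */
enum { IMAGE_SIZE = 64 };

/* Copy the image at OLD_BASE to NEW_BASE, as if the loader had moved the
   data section, and return the new address of the byte at OFFSET.  */
char *
relocate_image (char *new_base, const char *old_base, size_t offset)
{
  assert (offset < IMAGE_SIZE);
  memcpy (new_base, old_base, IMAGE_SIZE);
  return new_base + offset;
}

/* Return nonzero if ADDRESS lies inside the image that starts at BASE.
   The comparison is done on integers so that it is well defined even
   when the two pointers refer to different objects.  */
int
address_in_image (const char *address, const char *base)
{
  uintptr_t a = (uintptr_t) address;
  uintptr_t b = (uintptr_t) base;

  return a >= b && a - b < IMAGE_SIZE;
}
```

A dumped executable is in the position of the saved absolute address:
it was correct for the old base, and nothing in the file says how to
adjust it for the new one.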

To circumvent the dynamic loader feature of AIX 3.1, the dump process
has been modified as follows:

  1) A new executable file is NOT created. Instead, both pure and
     impure data are saved by the dump function and automatically
     reloaded during process initialization. If any of the saved data
     is unavailable or invalid, loadup.el will be automatically loaded.

  2) Pure data is defined as a shared memory segment and attached
     automatically as read-only data during initialization. This
     allows the pure data to be a shared resource among all Emacs
     processes. The shared memory segment size is PURESIZE bytes.
     If the shared memory segment is unavailable or invalid, a new
     shared memory segment is created and the impure data save file
     is destroyed, forcing loadup.el to be reloaded.

  3) The ipc key used to create and access Emacs shared memory is
     SHMKEY and can be overridden by the environment variable
     EMACSSHMKEY. Only one ipc key is allowed per system. The
     environment variable is provided in case the default ipc key
     has already been used.

  4) Impure data is written to the ../bin/.emacs.data file by the
     dump function. This file contains the process's impure data
     at the moment of load completion. During Emacs initialization,
     the process's data section is expanded and overwritten
     with the .emacs.data file contents.
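
Points 2 and 3 above can be sketched roughly as follows. This is a
minimal sketch using the System V shared memory calls, not the code
that was in Emacs. The numeric values of DEFAULT_SHMKEY and
EXAMPLE_PURESIZE are placeholders, the function names are invented,
and the assumption that EMACSSHMKEY holds a plain number is mine; the
text does not say how the variable is parsed.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/shm.h>

/* Placeholder values; the real SHMKEY and PURESIZE would come from the
   Emacs configuration headers.  */
#define DEFAULT_SHMKEY   ((key_t) 5673)
#define EXAMPLE_PURESIZE ((size_t) 200000)

/* Return the ipc key to use.  ENV_VALUE is the value of EMACSSHMKEY, or
   NULL if it is unset.  Anything that is not a complete number (decimal,
   octal or hex) falls back to the default key.  */
key_t
resolve_shm_key (const char *env_value)
{
  char *end;
  long parsed;

  if (env_value == NULL || *env_value == '\0')
    return DEFAULT_SHMKEY;
  parsed = strtol (env_value, &end, 0);
  if (*end != '\0')
    return DEFAULT_SHMKEY;
  return (key_t) parsed;
}

/* Attach the existing pure-data segment read-only.  Return its address,
   or NULL if the segment does not exist or cannot be attached; the
   caller would then create a new segment and reload loadup.el.  */
void *
attach_pure_segment (key_t key)
{
  int id;
  void *addr;

  assert (key != IPC_PRIVATE);   /* a private key never names a shared segment */
  id = shmget (key, EXAMPLE_PURESIZE, 0);
  if (id == -1)
    return NULL;
  addr = shmat (id, NULL, SHM_RDONLY);
  return addr == (void *) -1 ? NULL : addr;
}
```

A caller would combine the two as
attach_pure_segment (resolve_shm_key (getenv ("EMACSSHMKEY"))).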

The following are software notes concerning the GNU Emacs dump
function under AIX 3.1:

  1) All of the new dump/load code is activated by the #ifdef SHMKEY
     conditional.

  2) The automatic loading of loadup.el does NOT cause the dump function
     to be performed. Therefore once the pure/impure data is discarded,
     someone must remake Emacs to create the saved data files. This
     should only be necessary when Emacs is first installed or whenever
     AIX is upgraded.

  3) Emacs will exit with an error if it is executed in a non-X
     environment but the dump function was performed within an X
     window. Therefore the dump function should always be performed
     in a non-X environment unless the X environment will ALWAYS be
     available.

  4) Emacs only maintains the lower 24 bits of any data address. The
     remaining upper 8 bits are reset by the XPNTR macro whenever any
     Lisp object is referenced. This poses a serious problem because
     pure data is stored in segment 3 (shared memory) and impure data
     is stored in segment 2 (data). To reset the upper 8 address bits
     correctly, XPNTR must guess as to which type of data is represented
     by the lower 24 address bits. The technique chosen is based upon
     the fact that pure data offsets in segment 3 range from
     0 to PURESIZE-1, which are relatively small offsets. Impure data
     offsets in segment 2 are relatively large (> 0x40000) because they
     must follow all shared library data. Therefore XPNTR adds segment
     3 to each data offset which is small (below PURESIZE) and adds
     segment 2 to all other offsets. This algorithm will remain valid
     as long as a) pure data size remains relatively small and b) process
     data is loaded after shared library data.

     To eliminate this guessing game, Emacs would have to preserve the
     full 32-bit address and accept additional per-object overhead to
     hold the object type and garbage collection mark bit.

  5) The data section written to .emacs.data is divided into three
     areas as shown below. The file header contains four character
     pointers which are used during automatic data loading. The file's
     contents will only be used if the first three addresses match
     their counterparts in the current process. The fourth address is
     the new data segment address required to hold all of the preloaded
     data.


                    .emacs.data file format

        +---------------------------------------+ \
        |     address of _data                  |  \
        +---------------------------------------+   \
        |     address of _end                   |    \
        +---------------------------------------+     file header
        |     address of initial sbrk(0)        |    /
        +---------------------------------------+   /
        |     address of final sbrk(0)          |  /
        +---------------------------------------+ /
        \                                       \
         \                                       \
              all data to be loaded from
              _data to _end
        \                                       \
         \                                       \
        +---------------------------------------+
        \                                       \
         \                                       \
              all data to be loaded from
              initial to final sbrk(0)
        \                                       \
        +---------------------------------------+


     Sections two and three contain the preloaded data which is
     restored at locations _data and initial sbrk(0) respectively.

     The reason two separate sections are needed is that process
     initialization allocates data (via malloc) prior to main()
     being called. Therefore _end is several kbytes lower than
     the address returned by an initial sbrk(0). This creates a
     hole in the process data space and malloc will abort if this
     region is overwritten during the load function.

     One further complication with the malloc'd space is that it
     is partially empty and must be "consumed" so that data space
     malloc'd in the future is not assigned to this region. The malloc
     function distributed with Emacs anticipates this problem but the
     AIX 3.1 version does not. Therefore, repeated malloc calls are
     needed to exhaust this initial malloc space. How do you know
     when malloc has exhausted its free memory? You don't! So the
     code must repeatedly call malloc for each buffer size and
     detect when a new memory page has been allocated. Once the new
     memory page is allocated, you can calculate the number of free
     buffers in that page and request exactly that many more. Future
     malloc requests will now be added at the top of a new memory page.

     One final point: the initial sbrk(0) is the value of sbrk(0)
     after all of the above malloc hacking has been performed.
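
The XPNTR guess described in note 4 can be sketched as below. This is
an illustration of the rule as the text states it, not the macro from
the Emacs sources. The PURESIZE value is a placeholder, the function
names are invented, and the segment base addresses assume that segment
n starts at n * 0x10000000 (256 MB segments), which the text does not
spell out.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values, for illustration only.  */
#define PURESIZE      0x30000u      /* placeholder pure-data size */
#define DATA_SEG_BITS 0x20000000u   /* segment 2: process data */
#define PURE_SEG_BITS 0x30000000u   /* segment 3: shared memory */
#define ADDRESS_MASK  0x00FFFFFFu   /* low 24 bits kept in a Lisp object */

/* Drop the upper 8 bits, as happens when an address is stored in a
   Lisp object.  */
uint32_t
strip_segment (uint32_t address)
{
  return address & ADDRESS_MASK;
}

/* Rebuild a full address from a 24-bit OFFSET by guessing the segment:
   offsets below PURESIZE are taken to be pure data in segment 3, and
   everything else is taken to be impure data in segment 2.  */
uint32_t
guess_full_address (uint32_t offset)
{
  assert ((offset & ~ADDRESS_MASK) == 0);
  if (offset < PURESIZE)
    return PURE_SEG_BITS | offset;
  return DATA_SEG_BITS | offset;
}
```

With these values, 0x20045678 survives a round trip through
strip_segment and guess_full_address, but an impure object at
0x20000100 would come back as 0x30000100. That is the case the text
rules out by requiring process data to follow the shared library data.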
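
The header check described in note 5 might look like the following. It
is a sketch under stated assumptions: the struct layout and all names
are hypothetical, addresses are modelled as integers rather than the
character pointers the text mentions, and reading the file and growing
the data segment are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the four-word .emacs.data header.  */
struct dump_header
{
  uintptr_t data_start;    /* address of _data */
  uintptr_t data_end;      /* address of _end */
  uintptr_t initial_brk;   /* address of initial sbrk(0) */
  uintptr_t final_brk;     /* address of final sbrk(0) */
};

/* Return nonzero if the saved file may be loaded into the current
   process: the first three addresses must match.  The fourth is not
   compared, since it only says how far the data segment must grow.  */
int
header_matches (const struct dump_header *saved,
                const struct dump_header *current)
{
  assert (saved != NULL && current != NULL);
  return (saved->data_start == current->data_start
          && saved->data_end == current->data_end
          && saved->initial_brk == current->initial_brk);
}

/* Size in bytes of the second file section (_data to _end).  */
size_t
static_area_size (const struct dump_header *h)
{
  return (size_t) (h->data_end - h->data_start);
}

/* Size in bytes of the third file section (initial to final sbrk(0)).  */
size_t
heap_area_size (const struct dump_header *h)
{
  return (size_t) (h->final_brk - h->initial_brk);
}
```

The region between data_end and initial_brk belongs to neither
section. It is the hole left by the startup malloc activity, which the
load must not overwrite.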
The following Emacs dump/load issues need to be addressed:

  1) loadup.el exits with an error message because the xemacs and
     emacs-xxx files are not created during the dump function.

     loadup.el should be changed to check for the new .emacs.data
     file.

  2) Dump will only support one .emacs.data file for the entire
     system. This precludes the ability to allow each user to
     define his/her own "dumped" Emacs.

     Add an environment variable to override the default .emacs.data
     path.

  3) An error message "error in init file" is displayed out of
     startup.el when the dumped Emacs is invoked by a non-root user.
     Although all of the preloaded Lisp code is present, the important
     purify-flag has not been set back to Qnil, precluding the
     loading of any further Lisp code until the flag is manually
     reset.

     The problem appears to be an access violation which will go
     away if the read-write access modes to all of the files are
     changed to rw-.

  4) In general, all file access modes should be changed from
     rw-r--r-- to rw-rw-rw-. They are currently set up to match
     standard AIX access modes.

  5) The dump function is not invoked when the automatic load of
     loadup.el is performed.

     Perhaps the command arguments array should be expanded with
     "dump" added to force an automatic dump.

  6) The automatic initialization function alloc_shm will delete
     the shared memory segment and .emacs.data file if the "dump"
     command argument is found in ANY argument position. The
     dump function will only take place in loadup.el if "dump"
     is the third or fourth command argument.

     Change alloc_shm to live by loadup.el rules.
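
The mismatch in issue 6 can be made concrete with two small
predicates. Both are hypothetical and neither is taken from alloc_shm
or loadup.el. In particular, "third or fourth command argument" is
read here as argv[3] or argv[4] with the program name at argv[0]; that
reading is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* The behaviour the text attributes to alloc_shm: "dump" anywhere
   after the program name counts.  */
int
dump_in_any_position (int argc, char **argv)
{
  int i;

  assert (argv != NULL);
  for (i = 1; i < argc; i++)
    if (strcmp (argv[i], "dump") == 0)
      return 1;
  return 0;
}

/* The stricter rule the text attributes to loadup.el, under the
   indexing assumption above: only argv[3] and argv[4] are examined.  */
int
dump_in_loadup_position (int argc, char **argv)
{
  int i;

  assert (argv != NULL);
  for (i = 3; i <= 4 && i < argc; i++)
    if (strcmp (argv[i], "dump") == 0)
      return 1;
  return 0;
}
```

Any command line for which the first predicate is true and the second
is false would discard the saved data without producing a new dump,
which is the situation the suggested change is meant to prevent.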