Commit | Line | Data |
---|---|---|
a933dad1 DL |
1 | The following text was written by someone at IBM to describe an older |
2 | version of the code for dumping on AIX. It does NOT apply to | |
3 | the current version of Emacs. It is included in case someone | |
4 | is curious. | |
5 | ||
6 | ||
7 | I (rms) couldn't understand the code, and I can't fully understand | |
8 | this text either. I rewrote the code to use the same basic | |
9 | principles, as far as I understood them, but more cleanly. This | |
10 | rewritten code does not always work. In fact, the basic method | |
11 | seems to be intrinsically flawed. | |
12 | ||
13 | Since then, someone else implemented a different way of dumping on | |
14 | the RS/6000, which does seem to work. None of the following | |
15 | applies to the way Emacs now dumps on the 6000. However, the | |
16 | current method fails to use shared libraries. Anyone who might be | |
17 | interested in trying to resurrect the previous method might still | |
18 | find the following information useful. | |
19 | ||
20 | ||
21 | It seems that the IBM dumping code was simply set up to detect when | |
22 | the dumped data cannot be used, and in that case to act approximately | |
23 | as if CANNOT_DUMP had been defined all along. (This is buried in | |
24 | paragraph 1.) It seems simpler just to define CANNOT_DUMP, since | |
25 | Emacs is not set up to decide at run time whether there is dumping or | |
26 | not, and doing so correctly would be a lot of work. | |
27 | ||
28 | Note that much of the other information, such as the name and format | |
29 | of the dumped data file, has been changed. | |
30 | ||
31 | ||
32 | --rms | |
33 | ||
34 | ||
35 | ||
36 | A different approach has been taken to implement the | |
37 | "dump/load" feature of GNU Emacs for AIX 3.1. Traditionally the | |
38 | unexec function creates a new a.out executable file which contains | |
39 | preloaded Lisp code. Executing the new a.out file (normally called | |
40 | xemacs) provides rapid startup since the standard suite of Lisp code | |
41 | is preloaded as part of the executable file. | |
42 | ||
43 | AIX 3.1 architecture precludes the use of this technique | |
44 | because the dynamic loader cannot guarantee a fixed starting location | |
45 | for the process data section. The loader loads all shared library | |
46 | data BEFORE process data. When a shared library changes its data | |
47 | space, the process initial data section address (_data) will change | |
48 | and all global process variables are automatically relocated to new | |
49 | addresses. This invalidates the "dumped" Emacs executable which has | |
50 | data addresses which are not relocatable and now corrupt. Emacs would | |
51 | fail to execute until rebuilt with the new libraries. | |
52 | ||
53 | To circumvent the dynamic loader feature of AIX 3.1, the dump process | |
54 | has been modified as follows: | |
55 | ||
56 | 1) A new executable file is NOT created. Instead, both pure and | |
57 | impure data are saved by the dump function and automatically | |
58 | reloaded during process initialization. If any of the saved data | |
59 | is unavailable or invalid, loadup.el will be automatically loaded. | |
60 | ||
61 | 2) Pure data is defined as a shared memory segment and attached | |
62 | automatically as read-only data during initialization. This | |
63 | allows the pure data to be a shared resource among all Emacs | |
64 | processes. The shared memory segment size is PURESIZE bytes. | |
65 | If the shared memory segment is unavailable or invalid, a new | |
66 | shared memory segment is created and the impure data save file | |
67 | is destroyed, forcing loadup.el to be reloaded. | |
68 | ||
69 | 3) The ipc key used to create and access Emacs shared memory is | |
70 | SHMKEY and can be overridden by the environment symbol EMACSSHMKEY. | |
71 | Only one ipc key is allowed per system. The environment symbol | |
72 | is provided in case the default ipc key has already been used. | |
73 | ||
74 | 4) Impure data is written to the ../bin/.emacs.data file by the | |
75 | dump function. This file contains the process' impure data | |
76 | at the moment of load completion. During Emacs initialization, | |
77 | the process' data section is expanded and overwritten | |
78 | with the .emacs.data file contents. | |
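
Points 2 and 3 above can be sketched in C roughly as follows. This is only an illustration, not the IBM code: the default SHMKEY value, the function names emacs_shm_key and attach_pure_segment, and the rule that EMACSSHMKEY is parsed as a number by strtol are all assumptions made for this example.

```c
#include <stdlib.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/shm.h>

/* Hypothetical default; the text only says that the key is named SHMKEY. */
#ifndef SHMKEY
#define SHMKEY 0x454d4143L
#endif

/* Return the ipc key to use: the value of EMACSSHMKEY if it is set to
   a well-formed number, otherwise the default SHMKEY.  (Parsing the
   variable with strtol is an assumption of this sketch.)  */
key_t
emacs_shm_key (void)
{
  const char *s = getenv ("EMACSSHMKEY");
  char *end;
  long v;

  if (s == NULL || *s == '\0')
    return (key_t) SHMKEY;
  v = strtol (s, &end, 0);
  if (*end != '\0')
    return (key_t) SHMKEY;
  return (key_t) v;
}

/* Try to attach an existing pure-data segment of at least PURESIZE
   bytes, read-only.  Return NULL if no usable segment exists; the
   caller would then create a new segment and discard the impure data
   file, as point 2 describes.  */
void *
attach_pure_segment (size_t puresize)
{
  int id = shmget (emacs_shm_key (), puresize, 0);
  void *p;

  if (id == -1)
    return NULL;
  p = shmat (id, NULL, SHM_RDONLY);
  return p == (void *) -1 ? NULL : p;
}
```

Attaching with SHM_RDONLY is what lets every Emacs process share one copy of the pure data without being able to modify it.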
79 | ||
80 | The following are software notes concerning the GNU Emacs dump function under AIX 3.1: | |
81 | ||
82 | 1) All of the new dump/load code is activated by the #ifdef SHMKEY | |
83 | conditional. | |
84 | ||
85 | 2) The automatic loading of loadup.el does NOT cause the dump function | |
86 | to be performed. Therefore once the pure/impure data is discarded, | |
87 | someone must remake Emacs to create the saved data files. This | |
88 | should only be necessary when Emacs is first installed or whenever | |
89 | AIX is upgraded. | |
90 | ||
91 | 3) Emacs will exit with an error if executed in a non-X environment | |
92 | and the dump function was performed within an X window. Therefore | |
93 | the dump function should always be performed in a non-X | |
94 | environment unless the X environment will ALWAYS be available. | |
95 | ||
96 | 4) Emacs only maintains the lower 24 bits of any data address. The | |
97 | remaining upper 8 bits are reset by the XPNTR macro whenever any | |
98 | Lisp object is referenced. This poses a serious problem because | |
99 | pure data is stored in segment 3 (shared memory) and impure data | |
100 | is stored in segment 2 (data). To reset the upper 8 address bits | |
101 | correctly, XPNTR must guess as to which type of data is represented | |
102 | by the lower 24 address bits. The technique chosen is based upon | |
103 | the fact that pure data offsets in segment 3 range from | |
104 | 0 -> PURESIZE-1, which are relatively small offsets. Impure data | |
105 | offsets in segment 2 are relatively large (> 0x40000) because they | |
106 | must follow all shared library data. Therefore XPNTR adds segment | |
107 | 3 to each data offset which is small (below PURESIZE) and adds | |
108 | segment 2 to all other offsets. This algorithm will remain valid | |
109 | as long as a) pure data size remains relatively small and b) process | |
110 | data is loaded after shared library data. | |
111 | ||
112 | To eliminate this guessing game, Emacs must preserve the 32-bit | |
113 | address and add additional data object overhead for the object type | |
114 | and garbage collection mark bit. | |
115 | ||
116 | 5) The data section written to .emacs.data is divided into three | |
117 | areas as shown below. The file header contains four character | |
118 | pointers which are used during automatic data loading. The file's | |
119 | contents will only be used if the first three addresses match | |
120 | their counterparts in the current process. The fourth address is | |
121 | the new data segment address required to hold all of the preloaded | |
122 | data. | |
123 | ||
124 | ||
125 | .emacs.data file format | |
126 | ||
127 | +---------------------------------------+ \ | |
128 | | address of _data | \ | |
129 | +---------------------------------------+ \ | |
130 | | address of _end | \ | |
131 | +---------------------------------------+ file header | |
132 | | address of initial sbrk(0) | / | |
133 | +---------------------------------------+ / | |
134 | | address of final sbrk(0) | / | |
135 | +---------------------------------------+ / | |
136 | \ \ | |
137 | \ \ | |
138 | all data to be loaded from | |
139 | _data to _end | |
140 | \ \ | |
141 | \ \ | |
142 | +---------------------------------------+ | |
143 | \ \ | |
144 | \ \ | |
145 | all data to be loaded from | |
146 | initial to final sbrk(0) | |
147 | \ \ | |
148 | +---------------------------------------+ | |
149 | ||
150 | ||
151 | Sections two and three contain the preloaded data which is | |
152 | restored at locations _data and initial sbrk(0) respectively. | |
153 | ||
154 | The reason two separate sections are needed is that process | |
155 | initialization allocates data (via malloc) prior to main() | |
156 | being called. Therefore _end is several kbytes lower than | |
157 | the address returned by an initial sbrk(0). This creates a | |
158 | hole in the process data space and malloc will abort if this | |
159 | region is overwritten during the load function. | |
160 | ||
161 | One further complication with the malloc'd space is that it | |
162 | is partially empty and must be "consumed" so that data space | |
163 | malloc'd in the future is not assigned to this region. The malloc | |
164 | function distributed with Emacs anticipates this problem but the | |
165 | AIX 3.1 version does not. Therefore, repeated malloc calls are | |
166 | needed to exhaust this initial malloc space. How do you know | |
167 | when malloc has exhausted its free memory? You don't! So the | |
168 | code must repeatedly call malloc for each buffer size and | |
169 | detect when a new memory page has been allocated. Once the new | |
170 | memory page is allocated, you can calculate the number of free | |
171 | buffers in that page and request exactly that many more. Future | |
172 | malloc requests will now be added at the top of a new memory page. | |
173 | ||
174 | One final point - the initial sbrk(0) is the value of sbrk(0) | |
175 | after all of the above malloc hacking has been performed. | |
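
The address guessing described in software note 4 above can be sketched as follows. This is a minimal illustration, not the actual XPNTR macro: the PURESIZE value, the segment base addresses (0x20000000 for segment 2, 0x30000000 for segment 3), and the function name xpntr_guess are assumptions chosen for the example.

```c
#include <stdint.h>

#define PURESIZE   0x30000u      /* hypothetical; must stay below the lowest impure offset */
#define SEG2_BASE  0x20000000u   /* assumed base of segment 2 (process data) */
#define SEG3_BASE  0x30000000u   /* assumed base of segment 3 (shared memory) */
#define ADDR_MASK  0x00FFFFFFu   /* the 24 address bits that Emacs keeps */

/* Rebuild a full 32-bit address from a Lisp object word.  Offsets
   below PURESIZE are taken to be pure data in segment 3; all other
   offsets are taken to be impure data in segment 2.  */
uint32_t
xpntr_guess (uint32_t lisp_word)
{
  uint32_t offset = lisp_word & ADDR_MASK;

  return offset < PURESIZE ? SEG3_BASE + offset : SEG2_BASE + offset;
}
```

The guess breaks down exactly as the note warns: if PURESIZE grew beyond the lowest impure offset, or if process data were loaded before the shared library data, a small impure offset would be mistaken for pure data.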
176 | ||
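
The header check described in software note 5 above might look something like this. It is a sketch under stated assumptions: the struct layout simply mirrors the four pointers in the diagram, and the names dump_header and dump_header_usable are hypothetical, not taken from the Emacs sources.

```c
#include <stdio.h>

/* The four character pointers at the start of .emacs.data.  */
struct dump_header
{
  char *data_start;    /* address of _data */
  char *data_end;      /* address of _end */
  char *initial_brk;   /* address of initial sbrk(0) */
  char *final_brk;     /* address of final sbrk(0) */
};

/* Read the header from FP and compare its first three addresses with
   those of the current process, given in CURRENT.  Return 1 and store
   the saved final break in *NEW_BRK if the file may be used; return 0
   if the header cannot be read or any of the three addresses differs.  */
int
dump_header_usable (FILE *fp, const struct dump_header *current,
                    char **new_brk)
{
  struct dump_header saved;

  if (fread (&saved, sizeof saved, 1, fp) != 1)
    return 0;
  if (saved.data_start != current->data_start
      || saved.data_end != current->data_end
      || saved.initial_brk != current->initial_brk)
    return 0;
  *new_brk = saved.final_brk;
  return 1;
}
```

Only after this check succeeds would the loader grow the data segment to the saved final break and read the two data areas back to _data and to the initial sbrk(0) address.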
177 | ||
178 | The following Emacs dump/load issues need to be addressed: | |
179 | ||
180 | 1) Loadup.el exits with an error message because the xemacs and | |
181 | emacs-xxx files are not created during the dump function. | |
182 | ||
183 | Loadup.el should be changed to check for the new .emacs.data | |
184 | file. | |
185 | ||
186 | 2) Dump will only support one .emacs.data file for the entire | |
187 | system. This precludes the ability to allow each user to | |
188 | define his/her own "dumped" Emacs. | |
189 | ||
190 | Add an environment symbol to override the default .emacs.data | |
191 | path. | |
192 | ||
193 | 3) An error message "error in init file" is displayed out of | |
194 | startup.el when the dumped Emacs is invoked by a non-root user. | |
195 | Although all of the preloaded Lisp code is present, the important | |
196 | purify-flag has not been set back to Qnil - precluding the | |
197 | loading of any further Lisp code until the flag is manually | |
198 | reset. | |
199 | ||
200 | The problem appears to be an access violation which will go | |
201 | away if the read-write access modes to all of the files are | |
202 | changed to rw-. | |
203 | ||
204 | 4) In general, all file access modes should be changed from | |
205 | rw-r--r-- to rw-rw-rw-. They are currently set up to match | |
206 | standard AIX access modes. | |
207 | ||
208 | 5) The dump function is not invoked when the automatic load of | |
209 | loadup.el is performed. | |
210 | ||
211 | Perhaps the command arguments array should be expanded with | |
212 | "dump" added to force an automatic dump. | |
213 | ||
214 | 6) The automatic initialization function alloc_shm will delete | |
215 | the shared memory segment and .emacs.data file if the "dump" | |
216 | command argument is found in ANY argument position. The | |
217 | dump function will only take place in loadup.el if "dump" | |
218 | is the third or fourth command argument. | |
219 | ||
220 | Change alloc_shm to live by loadup.el rules. | |
221 |