DBM Libraries for use with Exim
-------------------------------

Background
----------

Exim uses direct-access (so-called "dbm") files for a number of different
purposes. These are files arranged so that the data they contain is indexed and
can quickly be looked up by quoting an appropriate key. They are used as
follows:

1. Exim keeps its "hints" databases in dbm files.

2. The configuration can specify that certain things (e.g. aliases) be looked
   up in dbm files.

3. The configuration can contain expansion strings that involve lookups in dbm
   files.

4. The filter commands "mail" and "vacation" have a facility for replying only
   once to each incoming address. The record of which addresses have already
   received replies may be kept in a dbm file, depending on the configuration
   option once_file_size.

The runtime configuration can be set up without specifying 2 or 3, but Exim
always requires the availability of a dbm library, for 1 (and 4 if configured
that way).


DBM Libraries
-------------

The original library that provided the dbm facility in Unix was called "dbm".
This seems to have been superseded quite some time ago by a new version called
"ndbm" which permits several dbm files to be open at once. Several operating
systems, including those from Sun, contain ndbm as standard.

A number of alternative libraries also exist, the most common of which seems to
be Berkeley DB (just called DB hereinafter). Release 1.85 was around for
some time, and various releases 2.x began to appear towards the end of 1997. In
November 1999, version 3.0 was released, and the ending of support for 2.7.7,
the last 2.x release, was announced for November 2000. (Support for 1.85 has
already ceased.) There were further 3.x releases, but by the end of 2001, the
current release was 4.0.14.

There are major differences in implementation and interface between the DB 1.x
and 2.x/3.x/4.x releases, and they are best considered as two independent dbm
libraries. Changes to the API were made for 3.0 and again for 3.1.

Another DBM library is the GNU library, gdbm, though this does not seem to be
very widespread.

Yet another dbm library is tdb (Trivial Data Base) which has come out of the
Samba project. The first releases seem to have been in mid-2000.

Some older Linux releases contain gdbm as standard, while others contain no dbm
library. More recent releases contain DB 1.85 or 2.x or later, and presumably
will track the development of the DB library. Some BSD versions of Unix include
DB 1.85 or later. All of the non-ndbm libraries except tdb contain
compatibility interfaces so that programs written to call the ndbm functions
should, in theory, work with them, but there are some potential pitfalls which
have caught out Exim users in the past.

Exim has been tested with ndbm, gdbm, DB 1.85, DB 2.x, DB 3.1, DB 4.0.14, and
tdb 1.0.2, in various different modes in some cases, and is believed to work
with all of them if it and they are properly configured.

I have considered the possibility of calling different dbm libraries for
different functions from a single Exim binary. However, because all bar one of
the libraries provide ndbm compatibility interfaces (and therefore the same
function names) it would require a lot of complicated, error-prone trickery to
achieve this. Exim therefore uses a single library for all its dbm activities.

However, Exim does also support cdb (Constant Data Base), an efficient file
arrangement for indexed data that does not change incrementally (for example,
alias files). This is independent of any dbm library and can be used alongside
any of them.


Locking
-------

The configuration option EXIMDB_LOCK_TIMEOUT controls how long Exim waits to
get a lock on a hints database. From version 1.80 onwards, Exim does not
attempt to take out a lock on an actual database file (this caused problems in
the past). Instead, it takes out an fcntl() lock on a separate file whose name
ends in ".lockfile". This ensures that Exim has exclusive access to the
database before even attempting to open it. Exim creates the lock file the
first time it needs it. It should never be removed.
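
The idea of locking a separate file, with a limit on how long to wait, can be
sketched in C as follows. This is only an illustration under stated
assumptions, not Exim's own code: the function name lock_hints_db, the
once-a-second retry loop, and the file mode are all invented for the example.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper, not taken from Exim. Opens (creating if necessary)
   "<dbname>.lockfile" and tries to take an exclusive fcntl() lock on it,
   retrying once a second for up to timeout_secs seconds. Returns an open,
   locked file descriptor, or -1 on failure or timeout. */
int lock_hints_db(const char *dbname, int timeout_secs)
{
  char path[1024];
  struct flock fl;
  int fd, waited;

  if (snprintf(path, sizeof(path), "%s.lockfile", dbname) >= (int)sizeof(path))
    return -1;                       /* name too long for the buffer */

  fd = open(path, O_RDWR | O_CREAT, 0640);
  if (fd < 0) return -1;

  fl.l_type = F_WRLCK;               /* exclusive (write) lock */
  fl.l_whence = SEEK_SET;
  fl.l_start = 0;
  fl.l_len = 0;                      /* zero length means the whole file */

  for (waited = 0; ; waited++)
    {
    if (fcntl(fd, F_SETLK, &fl) == 0) return fd;   /* lock obtained */
    if ((errno != EAGAIN && errno != EACCES) || waited >= timeout_secs) break;
    sleep(1);                        /* held by another process; try again */
    }

  close(fd);
  return -1;
}
```

The lock is released when the returned descriptor is closed. The lock file
itself is deliberately left in place, in line with the advice above that it
should never be removed.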


Main Pitfall
------------

The OS-specific configuration files that are used to build Exim specify the use
of Berkeley DB on those systems where it is known to be standard. In the
absence of any special configuration options, Exim uses the ndbm set of
functions to control its dbm databases. This should work with any of the dbm
libraries because those that are not ndbm have compatibility interfaces.
However, there is one awful pitfall:

Exim #includes a header file called ndbm.h which defines the functions and the
interface data block; gdbm and DB 1.x provide their own versions of this header
file, but later DB versions do not. If it should happen that the wrong version
of ndbm.h is seen by Exim, it may compile without error, but fail to operate
correctly at runtime.

This situation can easily arise when more than one dbm library is installed on
a single host. For example, if you decide to use DB 1.x on a system where gdbm
is the standard library, unless you are careful in setting up the include
directories for Exim, it may see gdbm's ndbm.h file instead of DB's. The
situation is even worse with later versions of DB, which do not provide an
ndbm.h file at all.

One way out of this for gdbm and any of the versions of DB is to configure Exim
to call the DBM library in its native mode instead of via the ndbm
compatibility interface, thus avoiding the use of ndbm.h. This is done by
setting the USE_DB configuration option if you are using Berkeley DB, or
USE_GDBM if you are using gdbm. This is the recommended approach.
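
The effect of these options is a compile-time choice of interface. The
following C fragment is a minimal sketch of that kind of selection. It is not
Exim's source: the function name dbm_interface_name, the strings it returns,
and the order in which the macros are tested are assumptions made for the
illustration. Only the option names USE_DB and USE_GDBM come from the text
above.

```c
/* Hypothetical illustration of compile-time interface selection. Reports
   which interface a build would use, depending on which of USE_DB and
   USE_GDBM was defined when this file was compiled. */
const char *dbm_interface_name(void)
{
#if defined(USE_DB)
  return "db (native)";      /* Berkeley DB native interface; no ndbm.h */
#elif defined(USE_GDBM)
  return "gdbm (native)";    /* gdbm native interface; no ndbm.h */
#else
  return "ndbm";             /* default: ndbm functions, needs ndbm.h */
#endif
}
```

Compiling the fragment with -DUSE_DB or -DUSE_GDBM changes the answer. With
neither defined, the ndbm branch is taken, and that is the case in which the
ndbm.h pitfall applies.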


NDBM
----

The ndbm library holds its data in two files, with extensions .dir and .pag.
This makes atomic updating of, for example, alias files, difficult, because
simple renaming cannot be used without some risk. However, if your system has
ndbm installed, Exim should compile and run without any problems.


GDBM
----

The gdbm library, when called via the ndbm compatibility interface, makes two
hard links to a single file, with extensions .dir and .pag. As mentioned above,
gdbm provides its own version of the ndbm.h header, and you must ensure that
this is seen by Exim rather than any other version. This is not likely to be a
problem if gdbm is the only dbm library on your system.

If gdbm is called via the native interface (by setting USE_GDBM in your
Local/Makefile), it uses a single file, with no extension on the name, and the
ndbm.h header is not required.

The gdbm library does its own locking of the single file that it uses. From
version 1.80 onwards, Exim locks on an entirely separate file before accessing
a hints database, so gdbm's locking should always succeed.


Berkeley DB 1.8x
----------------

1.85 was the most widespread DB 1.x release; there is also a 1.86 bug-fix
release, but the belief is that the bugs it fixes will not affect Exim.
However, maintenance for 1.x releases has been phased out.

This dbm library can be called by Exim in one of two ways: via the ndbm
compatibility interface, or via its own native interface. There are two
advantages to doing the latter: (1) you don't run the risk of Exim's seeing the
"wrong" version of the ndbm.h header, as described above, and (2) the
performance is better. It is therefore recommended that you set USE_DB=yes in
an appropriate Local/Makefile-xxx file. (If you are compiling for just one OS,
it can go in Local/Makefile itself.)

When called via the compatibility interface, DB 1.x creates a single file with
a .db extension. When called via its native interface, no extension is added to
the file name handed to it.

DB 1.x does not do any locking of its own.


Berkeley DB 2.x
---------------

DB 2.x was released in 1997. It is a major re-implementation and its native
interface is incompatible with DB 1.x, though a compatibility interface was
introduced in DB 2.1.0, and there is also an ndbm.h compatibility interface.

Like 1.x, it can be called from Exim via the ndbm compatibility interface or
via its native interface, and once again setting USE_DB in order to get the
native interface is recommended. If USE_DB is *not* set, then you will have to
provide a suitable version of ndbm.h, because one does not come with the DB 2.x
distribution. A suitable version is:

  /*************************************************
  *             ndbm.h header for DB 2.x           *
  *************************************************/

  /* This header should replace any other version of ndbm.h when Berkeley DB
  version 2.x is in use via the ndbm compatibility interface. Otherwise, any
  extant version of ndbm.h may cause programs to misbehave. There doesn't seem
  to be a version of ndbm.h supplied with DB 2.x, so I made this for myself.

  Philip Hazel 12/Jun/97
  */

  #define DB_DBM_HSEARCH
  #include <db.h>

  /* End */

When called via the compatibility interface, DB 2.x creates a single file with
a .db extension. When called via its native interface, no extension is added to
the file name handed to it.
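
The file naming rules given so far for ndbm, gdbm, and DB 1.x/2.x can be
gathered into one small C helper. This is a sketch for illustration only: the
enum dbm_mode, its member names, and the function dbm_file_names are
hypothetical and do not appear in Exim. Only the file extensions are taken
from the descriptions above.

```c
#include <stdio.h>

/* Hypothetical names for the library/interface combinations described in
   this document. */
enum dbm_mode {
  MODE_NDBM,          /* real ndbm: two files, .dir and .pag */
  MODE_GDBM_COMPAT,   /* gdbm via ndbm interface: .dir and .pag hard links */
  MODE_GDBM_NATIVE,   /* gdbm native: one file, no extension */
  MODE_DB_COMPAT,     /* DB 1.x/2.x via ndbm interface: one file, .db */
  MODE_DB_NATIVE      /* DB 1.x/2.x native: one file, no extension */
};

/* Writes the expected on-disk file name(s) for a dbm file called "name" into
   out, separated by a space. Returns the number of names written, or -1 if
   the buffer is too small. */
int dbm_file_names(enum dbm_mode mode, const char *name,
                   char *out, size_t outsize)
{
  int n, count;

  switch (mode)
    {
    case MODE_NDBM:
    case MODE_GDBM_COMPAT:
      n = snprintf(out, outsize, "%s.dir %s.pag", name, name);
      count = 2;
      break;

    case MODE_DB_COMPAT:
      n = snprintf(out, outsize, "%s.db", name);
      count = 1;
      break;

    default:                         /* the two native interfaces */
      n = snprintf(out, outsize, "%s", name);
      count = 1;
      break;
    }

  return (n < 0 || (size_t)n >= outsize) ? -1 : count;
}
```

For example, calling it with MODE_DB_COMPAT and the name "aliases" yields the
single name "aliases.db", whereas MODE_NDBM yields "aliases.dir aliases.pag".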

DB 2.x does not do any automatic locking of its own; it does have a set of
functions for various forms of locking, but Exim does not use them.


Berkeley DB 3.x
---------------

DB 3.0 was released in November 1999 and 3.1 in June 2000. The 3.x series is a
development of the 2.x series and the above comments apply. Exim can
automatically distinguish between the different versions, so it copes with the
changes to the API without needing any special configuration.

When Exim creates a DBM file using DB 3.x (e.g. when creating one of its hints
databases), it specifies the "hash" format. However, when it opens a DB 3 file
for reading only, it specifies "unknown". This means that it can read DB 3
files in other formats that are created by other programs.
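
The rule for choosing the format can be sketched in C as below. To keep the
fragment compilable without db.h, it uses a stand-in enum in place of Berkeley
DB's own DB_HASH and DB_UNKNOWN type constants. The enum and the function name
choose_db_type are hypothetical, and the test on the open flags is an
assumption about how "reading only" might be detected, not a copy of Exim's
code.

```c
#include <fcntl.h>

/* Stand-ins for Berkeley DB's DB_HASH and DB_UNKNOWN, so that this sketch
   compiles without <db.h>. */
enum sketch_dbtype { SKETCH_DB_HASH, SKETCH_DB_UNKNOWN };

/* Choose the database type to request, given open(2)-style flags: "unknown"
   for a read-only open, so that a file in any format can be read, and "hash"
   otherwise, so that newly created files are always hash files. */
enum sketch_dbtype choose_db_type(int open_flags)
{
  return ((open_flags & O_ACCMODE) == O_RDONLY) ?
    SKETCH_DB_UNKNOWN : SKETCH_DB_HASH;
}
```

With these definitions, choose_db_type(O_RDONLY) selects the "unknown" type,
and choose_db_type(O_RDWR|O_CREAT) selects "hash".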


Berkeley DB 4.x
---------------

The 4.x series is a development of the 2.x and 3.x series, and the above
comments apply.


tdb
---

tdb 1.0.2 was released in September 2000. Its origin is the database functions
that are used by the Samba project.



Testing Exim's dbm handling
---------------------------

Because there have been problems with dbm file locking in the past, I built
some testing code for Exim's dbm functions. This is very much a lash-up, but it
is documented here so that anybody who wants to check that their configuration
is locking properly can do so. Now that Exim does the locking on an entirely
separate file, locking problems are much less likely, but this code still
exists, just in case. Proceed as follows:

. Build Exim in the normal way. This ensures that all the makefiles etc. get
  set up.

. From within the build directory, obey "make test_dbfn". This makes a binary
  file called test_dbfn. If you are experimenting with different configurations
  you *must* do "make makefile" after changing anything, before obeying "make
  test_dbfn" again, because the make target for test_dbfn isn't integrated
  with the making of the makefile.

. Identify a scratch directory where you have write access. Create a
  sub-directory called "db" in the scratch directory.

. Type the command "test_dbfn <scratch-directory>". This will output some
  general information such as

    Exim's db functions tester: interface type is db (v2)
    DBM library: Berkeley DB: Sleepycat Software: DB 2.1.0: (6/13/97)
    USE_DB is defined

  It then says

    Test the functions
    >

. At this point you can type commands to open a dbm file and read and write
  data in it. First type the command "open <name>", e.g. "open junk". The
  response should look like this

    opened DB file <scratch-directory>/db/junk: flags=102
    Locked
    opened 0
    >

  The tester will have created a dbm file within the db directory of the
  scratch directory. It will also have created a file with the extension
  ".lockfile" in the same directory. Unlike Exim itself, it will not create
  the db directory for itself if it does not exist.

. To test the locking, don't type anything more for the moment. You now need to
  set up another process running the same test_dbfn command, e.g. from a
  different logon to the same host. This time, when you attempt to open the
  file it should fail after a minute with a timeout error because it is
  already in use.

. If the second process doesn't produce any error message, but gets back to the
  > prompt, then the locking is not working properly.

. You can check that the second process gets the lock when the first process
  releases it by exiting from the first process with ^D, q, or quit; or by
  typing the command "close".

. There are some other commands available that are not related to locking:

    write <key> <data>
    e.g.
    write abcde the quick brown fox

  writes a record to the database,

    read <key>
    delete <key>

  read and delete a record, respectively, and

    scan

  scans the entire database. Note that the database is purely for testing the
  dbm functions. It is *not* one of Exim's regular databases, and you should
  not try running this testing program on any of Exim's real database
  files.
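
The two-process locking check described in the steps above can also be
approximated in a single C program, which may be handy on a host where a
second logon is awkward. The sketch below is not part of Exim or of
test_dbfn; the function names try_lock and second_process_is_blocked are
invented for the example. It relies on the fact that fcntl() locks belong to
a process and are not inherited by a child created with fork(), so the child
acts as a genuine second contender.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Try once, without waiting, to take an exclusive fcntl() lock on the whole
   of the file open on fd. Returns 0 on success, -1 on failure. */
static int try_lock(int fd)
{
  struct flock fl;
  fl.l_type = F_WRLCK;
  fl.l_whence = SEEK_SET;
  fl.l_start = 0;
  fl.l_len = 0;
  return fcntl(fd, F_SETLK, &fl);
}

/* Hypothetical check. Returns 1 if a second process is refused the lock
   while this process holds it (locking works), 0 if the second process
   obtained it (locking is not working), or -1 on a setup error. */
int second_process_is_blocked(const char *lockfile)
{
  int status;
  pid_t pid;
  int fd = open(lockfile, O_RDWR | O_CREAT, 0640);

  if (fd < 0) return -1;
  if (try_lock(fd) != 0) { close(fd); return -1; }

  pid = fork();
  if (pid < 0) { close(fd); return -1; }

  if (pid == 0)
    {
    /* Child: open the file afresh and contend for the lock. */
    int cfd = open(lockfile, O_RDWR);
    if (cfd < 0) _exit(2);
    _exit(try_lock(cfd) == 0 ? 0 : 1);
    }

  if (waitpid(pid, &status, 0) < 0) { close(fd); return -1; }
  close(fd);                         /* releases the parent's lock */

  if (!WIFEXITED(status) || WEXITSTATUS(status) == 2) return -1;
  return WEXITSTATUS(status) == 1;
}
```

A result of 1 corresponds to the second test_dbfn process timing out, and a
result of 0 to it getting back to the > prompt without an error. As with
test_dbfn, this should be pointed at a scratch file and not at any of Exim's
real lock files.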

Philip Hazel
Last update: June 2002