1 <!-- -*- mode: sgml; mode: fold -*- -->
2 <!doctype debiandoc PUBLIC "-//DebianDoc//DTD DebianDoc//EN">
3 <book>
4 <title>APT Files</title>
5
6 <author>Jason Gunthorpe <email>jgg@debian.org</email></author>
7 <version>$Id: files.sgml,v 1.12 2003/04/26 23:26:13 doogie Exp $</version>
8
9 <abstract>
10 This document describes the complete implementation and format of the
11 installed APT directory structure. It also serves as a guide to how APT
12 views the Debian archive.
13 </abstract>
14
15 <copyright>
16 Copyright &copy; Jason Gunthorpe, 1998-1999.
17 <p>
18 "APT" and this document are free software; you can redistribute them and/or
19 modify them under the terms of the GNU General Public License as published
20 by the Free Software Foundation; either version 2 of the License, or (at your
21 option) any later version.
22
23 <p>
24 For more details, on Debian GNU/Linux systems, see the file
25 /usr/share/common-licenses/GPL for the full license.
26 </copyright>
27
28 <toc sect>
29
30 <chapt>Introduction
31 <!-- General {{{ -->
32 <!-- ===================================================================== -->
33 <sect>General
34
35 <p>
36 This document serves two purposes. The first is to document the installed
37 directory structure and the format and purpose of each file. The second
38 purpose is to document how APT views the Debian archive and deals with
39 multiple package files.
40
41 <p>
42 The directory structure is as follows:
43 <example>
44   /var/lib/apt/
45     lists/
46       partial/
47     periodic/
48     extended_states
49     cdroms.list
50   /var/cache/apt/
51     archives/
52       partial/
53     pkgcache.bin
54     srcpkgcache.bin
55   /etc/apt/
56     sources.list.d/
57     apt.conf.d/
58     preferences.d/
59     trusted.gpg.d/
60     sources.list
61     apt.conf
62     preferences
63     trusted.gpg
64   /usr/lib/apt/
65     methods/
66       bzip2
67       cdrom
68       copy
69       file
70       ftp
71       gpgv
72       gzip
73       http
74       https
75       lzma
76       rred
77       rsh
78       ssh
79 </example>
80
81 <p>
82 As specified in the FHS 2.1, /var/lib/apt is used for application
83 data that is not expected to be modified by the user. /var/cache/apt is used
84 for regenerable data and is where the package cache and downloaded .debs
85 go. /etc/apt is the place where configuration should happen, and
86 /usr/lib/apt is the place where apt and other packages can place
87 binaries which can be used by the acquire system of APT.
88 </sect>
89 <!-- }}} -->
90
91 <chapt>Files
92 <!-- Distribution Source List {{{ -->
93 <!-- ===================================================================== -->
94 <sect>Files and fragment directories in /etc/apt
95
96 <p>
97 All files in /etc/apt are used to modify specific aspects of APT. To enable
98 other packages to ship the configuration they need, each of these files has
99 a fragment directory in which packages can place their own files instead of
100 modifying the main files. The main files are therefore considered to be used
101 only by the user and not by packages. The documentation omits these directories
102 most of the time for readability, so every time the documentation refers to
103 a main file it really means the file or its fragment directory.
104
105 </sect>
106
107 <sect>Distribution Source list (sources.list)
108
109 <p>
110 The distribution source list is used to locate archives of the Debian
111 distribution. It is designed to support any number of active sources and to
112 support a mix of source media. The file lists one source per line, with the
113 fastest source listed first. The format of each line is:
114
115 <p>
116 <var>type uri args</var>
117
118 <p>
119 The first item, <var>type</var>, indicates the format for the remainder
120 of the line. It is designed to indicate the structure of the distribution
121 the line is talking about. Currently the only defined value is <em>deb</em>
122 which indicates a standard debian archive with a dists dir.
123
124 <sect1>The deb Type
125 <p>
126 The <em>deb</em> type describes a typical two-level Debian archive,
127 dists/<var>distribution</var>/<var>component</var>. Typically distribution
128 is one of stable, unstable or testing while component is one of main,
129 contrib, non-free or non-us. The format for the deb line is as follows:
130
131 <p>
132 deb <var>uri</var> <var>distribution</var> <var>component</var>
133 [<var>component</var> ...]
134
135 <p>
136 <var>uri</var> for the <em>deb</em> type must specify the base of the
137 Debian distribution. APT will automatically generate the proper longer
138 URIs to get the information it needs. <var>distribution</var> can also specify
139 an exact path; in this case the components must be omitted and
140 <var>distribution</var> must end in a slash.
141
142 <p>
143 Since only one distribution can be specified per deb line it may be
144 necessary to list a number of deb lines for the same URI. APT will
145 sort the URI list after it has generated a complete set to allow
146 connection reuse. It is important to order things in the sourcelist
147 from most preferred to least preferred (fastest to slowest).
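
<p>
The line format described above can be illustrated with a small Python sketch.
It is not APT's own parser: the names <em>DebSource</em> and
<em>parse_deb_line</em> are hypothetical, and splitting on whitespace is an
assumption that does not cope with URIs containing spaces (such as some cdrom
names).

```python
from typing import List, NamedTuple


class DebSource(NamedTuple):
    """One parsed deb line (hypothetical helper type)."""
    uri: str
    distribution: str
    components: List[str]


def parse_deb_line(line: str) -> DebSource:
    """Split a 'deb uri distribution [component ...]' line into its parts."""
    fields = line.split()
    if len(fields) < 3 or fields[0] != "deb":
        raise ValueError(f"not a deb line: {line!r}")
    uri, distribution, components = fields[1], fields[2], fields[3:]
    if distribution.endswith("/"):
        # Exact path form: the components must be omitted.
        if components:
            raise ValueError("an exact path must not be followed by components")
    elif not components:
        raise ValueError("at least one component is required")
    return DebSource(uri, distribution, components)


print(parse_deb_line("deb http://www.debian.org/archive stable main contrib"))
```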
148 </sect1>
149
150 <sect1>URI specification
151 <p>
152 URIs in the source list support a large number of access schemes which
153 are listed in the sources.list manpage and can be further extended by
154 transport binaries placed in /usr/lib/apt/methods. The most important
155 builtin schemes are:
156
157 <taglist>
158 <tag>cdrom<item>
159 The cdrom scheme is special in that If-Modified-Since queries are never
160 performed and that APT knows how to match a cdrom to the name it
161 was given when first inserted. APT also knows all of the possible
162 mount points of the cdrom drives and that the user should be prompted
163 to insert a CD if it cannot be found. The path is relative to an
164 arbitrary mount point (of APT's choosing) and must not start with a
165 slash. The first pathname component is the given name and is purely
166 descriptive and of the user's choice. However, if a file in the root of
167 the cdrom is called '.disk/info' its contents will be used instead of
168 prompting. The name serves as a tag for the cdrom and should be unique.
169 <example>
170 cdrom:Debian 1.3/debian
171 </example>
172
173 <tag>http<item>
174 This scheme specifies an HTTP server for the Debian archive. HTTP is preferred
175 over FTP because If-Modified-Since queries against the Packages file are
176 possible, as are deep pipelining and resuming of downloads.
177 <example>
178 http://www.debian.org/archive
179 </example>
180
181 <tag>ftp<item>
182 This scheme specifies an FTP connection to the server. FTP is limited because
183 it has no support for If-Modified-Since (IMS) queries and is hard to proxy through firewalls.
184 <example>
185 ftp://ftp.debian.org/debian
186 </example>
187
188 <tag>file<item>
189 The file scheme allows an arbitrary directory in the file system to be
190 considered as a debian archive. This is useful for NFS mounts and
191 local mirrors/archives.
192 <example>
193 file:/var/debian
194 </example>
195 </taglist>
196 </sect1>
197
198 <sect1>Hashing the URI
199 <p>
200 All permanent information acquired from any of the sources is stored in the
201 lists directory. Thus, there must be a way to relate the filename in the
202 lists directory to a line in the source list. To simplify things, this is
203 done by quoting the URI, treating _ as a character that must be quoted, and
204 then converting / to _. Following the URI spec, a sensitive character is
205 quoted by converting it into %xx, where xx is the hexadecimal value of the
206 character in the ASCII character set. Examples:
207
208 <example>
209 http://www.debian.org/archive/dists/stable/binary-i386/Packages
210 /var/lib/apt/lists/www.debian.org_archive_dists_stable_binary-i386_Packages
211
212 cdrom:Debian 1.3/debian/Packages
213 /var/lib/apt/lists/Debian%201.3_debian_Packages
214 </example>
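
<p>
The mapping can be sketched in Python as follows. This is an illustration of
the rule as described here, not APT's implementation: the function name
<em>uri_to_list_name</em> is hypothetical, and the exact set of characters
that gets quoted is whatever Python's <em>urllib.parse.quote</em> chooses,
which is an assumption and may differ from APT.

```python
from urllib.parse import quote


def uri_to_list_name(uri: str) -> str:
    """Turn a source URI into a flat file name for the lists directory."""
    # Drop the access scheme ("http:", "cdrom:") and any leading slashes.
    _, _, rest = uri.partition(":")
    rest = rest.lstrip("/")
    # Percent-quote sensitive characters, keeping "/" for the last step.
    quoted = quote(rest, safe="/")
    # quote() never touches "_", so quote it by hand before "/" becomes "_".
    quoted = quoted.replace("_", "%5F")
    return quoted.replace("/", "_")


print(uri_to_list_name("cdrom:Debian 1.3/debian/Packages"))
```

Quoting the underscore before converting the slashes is what keeps the mapping
unambiguous: every _ left in the result stands for a / in the original URI.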
215
216 <p>
217 The other alternative that was considered was to use a deep directory
218 structure, but this poses two problems: it makes it very difficult to prune
219 directories back when sources are no longer used, and it complicates the handling
220 of the partial directory. The flat scheme gives a very simple way to deal with all
221 of the situations that can arise. Also note that the same rules described in
222 the <em>Downloads Directory</> section regarding the partial subdirectory apply
223 here as well.
224 </sect1>
225
226 </sect>
227 <!-- }}} -->
228 <!-- Extended Status {{{ -->
229 <!-- ===================================================================== -->
230 <sect>Extended States File (extended_states)
231
232 <p>
233 The extended_states file serves the same purpose as the normal dpkg status file
234 (/var/lib/dpkg/status) except that it stores information unique to apt.
235 Currently this includes only the auto-installed flag, but the file is open to
236 storing further data as the need arises. It duplicates nothing from the normal
237 dpkg status file. Please see other APT documentation for a discussion
238 of the exact internal behaviour of these fields. The Package and
239 Architecture fields are placed directly before the new fields to indicate
240 which package they apply to. The new fields are as follows:
241
242 <taglist>
243 <tag>Auto-Installed<item>
244 The Auto-Installed flag can be 1 (Yes) or 0 (No) and records whether the package
245 was automatically installed to satisfy a dependency or whether the user requested
246 the installation.
247 </taglist>
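
<p>
As an illustration of the layout described above, the following Python sketch
lists the packages marked as automatically installed. It assumes that stanzas
are separated by blank lines and that each field fits on one line; the package
names in the sample and the function names are hypothetical.

```python
from typing import Dict, List


def parse_stanzas(text: str) -> List[Dict[str, str]]:
    """Parse blank-line separated 'Field: value' stanzas into dictionaries."""
    stanzas = []
    for block in text.strip().split("\n\n"):
        fields = {}
        for line in block.splitlines():
            name, _, value = line.partition(":")
            fields[name.strip()] = value.strip()
        if fields:
            stanzas.append(fields)
    return stanzas


def auto_installed(text: str) -> List[str]:
    """Return 'package:architecture' for every stanza with Auto-Installed: 1."""
    return [
        f"{s['Package']}:{s['Architecture']}"
        for s in parse_stanzas(text)
        if s.get("Auto-Installed") == "1"
    ]


# Made-up sample contents; real files are written by apt itself.
SAMPLE = """\
Package: libfoo1
Architecture: i386
Auto-Installed: 1

Package: bar
Architecture: i386
Auto-Installed: 0
"""

print(auto_installed(SAMPLE))
```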
248 </sect>
249 <!-- }}} -->
250 <!-- Binary Package Cache {{{ -->
251 <!-- ===================================================================== -->
252 <sect>Binary Package Cache (srcpkgcache.bin and pkgcache.bin)
253
254 <p>
255 Please see cache.sgml for a complete description of what these files are. The
256 cache file is updated whenever the contents of the lists directory change.
257 If the cache is erased, corrupted or of a non-matching version it will
258 be automatically rebuilt by all of the tools that need it.
259 <em>srcpkgcache.bin</> contains a cache of all of the package files in the
260 source list. When the status files change, the cache can then be regenerated
261 from this prebuilt version for greater speed.
262 </sect>
263 <!-- }}} -->
264 <!-- Downloads Directory {{{ -->
265 <!-- ===================================================================== -->
266 <sect>Downloads Directory (archives)
267
268 <p>
269 The archives directory is where all downloaded .deb archives go. When the
270 file transfer is initiated the deb is placed in partial. Once the file
271 is fully downloaded and its MD5 hash and size are verified it is moved
272 from partial into archives/. Any files found in archives/ can be assumed
273 to be verified.
274
275 <p>
276 No directory structure is transferred from the receiving site and all .deb
277 file names conform to Debian conventions. No short (msdos) filename should
278 be placed in archives. If the need arises .debs should be unpacked, scanned
279 and renamed to their correct internal names. This is mostly to prevent
280 file name conflicts but other programs may depend on this if convenient.
281 A conforming .deb has the form name_version_arch.deb. Our archive
282 scripts do not handle epochs, but they are necessary and should be re-inserted.
283 If necessary _'s and :'s in the fields should be quoted using the % convention.
284 It must be possible to extract all 3 fields by examining the file name.
285 Downloaded .debs must be found in one of the package lists with an exact
286 name + version match.
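
<p>
The naming rule can be sketched in Python as shown below. The helper names are
hypothetical, the version used in the usage line is made up, and writing the
quoted characters with lowercase hexadecimal digits is an assumption of this
sketch rather than something this document specifies.

```python
from typing import Tuple
from urllib.parse import unquote


def quote_field(field: str) -> str:
    """Quote '%', '_' and ':' so a field cannot clash with the separators."""
    return field.replace("%", "%25").replace("_", "%5f").replace(":", "%3a")


def deb_file_name(name: str, version: str, arch: str) -> str:
    """Build a name_version_arch.deb file name, keeping the epoch."""
    return "_".join(quote_field(f) for f in (name, version, arch)) + ".deb"


def split_deb_file_name(file_name: str) -> Tuple[str, str, str]:
    """Recover (name, version, arch) by examining the file name alone."""
    if not file_name.endswith(".deb"):
        raise ValueError(f"not a .deb file name: {file_name!r}")
    stem = file_name[: -len(".deb")]
    # Unpacking fails with ValueError unless there are exactly three fields.
    name, version, arch = (unquote(f) for f in stem.split("_"))
    return name, version, arch


print(deb_file_name("apt", "1:0.8.15", "i386"))
```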
287 </sect>
288 <!-- }}} -->
289 <!-- The Methods Directory {{{ -->
290 <!-- ===================================================================== -->
291 <sect> The Methods Directory (/usr/lib/apt/methods)
292
293 <p>
294 The Methods directory is more fully described in the APT Methods interface
295 document.
296 </sect>
297 <!-- }}} -->
298 <!-- The Configuration File {{{ -->
299 <!-- ===================================================================== -->
300 <sect> The Configuration File (/etc/apt/apt.conf)
301
302 <p>
303 The configuration file (and the associated fragments directory
304 /etc/apt/apt.conf.d/) is described in the apt.conf manpage.
305 </sect>
306 <!-- }}} -->
307 <!-- The trusted.gpg File {{{ -->
308 <!-- ===================================================================== -->
309 <sect> The trusted.gpg File (/etc/apt/trusted.gpg)
310
311 <p>
312 The trusted.gpg file (like the files in the associated fragments directory
313 /etc/apt/trusted.gpg.d/) is a binary file containing the keyring used
314 by apt to validate that the information (e.g. the Release file) it
315 downloads really comes from the distributor it claims to come from and is
316 unmodified. It is therefore the last step in the chain of trust between
317 the archive and the end user. This security system is described in the
318 apt-secure manpage.
319 </sect>
320 <!-- }}} -->
321 <!-- The Release File {{{ -->
322 <!-- ===================================================================== -->
323 <sect> The Release File
324
325 <p>
326 This file plays an important role in how APT presents the archive to the
327 user. Its main purpose is to present a descriptive name for the source
328 of each version of each package. It is also used to detect when new versions
329 of Debian are released. It augments the package file it is associated with
330 by providing meta information about the entire archive which the Packages
331 file describes.
332
333 <p>
334 The full name of the distribution for presentation to the user is formed
335 as 'label version archive', with a possible extended name being
336 'label version archive component'.
337
338 <p>
339 The file uses the same format as the package file (RFC-822), with the following tags
340 defined:
341
342 <taglist>
343 <tag>Archive<item>
344 This is the common name we give our archives, such as <em>stable</> or
345 <em>unstable</>.
346
347 <tag>Component<item>
348 Refers to the sub-component of the archive, <em>main</>, <em>contrib</>
349 etc. Component may be omitted if there are no components for this archive.
350
351 <tag>Version<item>
352 This is a version string with the same properties as in the Packages file.
353 It represents the release level of the archive.
354
355 <tag>Origin<item>
356 This specifies who is providing this archive. In the case of Debian the
357 string will read 'Debian'. Other providers may use their own string.
358
359 <tag>Label<item>
360 This carries the encompassing name of the distribution. For Debian proper
361 this field reads 'Debian'. For derived distributions it should contain their
362 proper name.
363
364 <tag>Architecture<item>
365 When the archive has packages for a single architecture then the Architecture
366 is listed here. If a mixed set of systems are represented then this should
367 contain the keyword <em>mixed</em>.
368
369 <tag>NotAutomatic<item>
370 A Yes/No flag indicating that the archive is extremely unstable and its
371 versions should never be automatically selected. This is to be used by
372 experimental.
373
374 <tag>Description<item>
375 Description is used to describe the release. For instance experimental would
376 contain a warning that the packages have problems.
377 </taglist>
378
379 <p>
380 The location of the Release file in the archive is very important; it must
381 be in the same location as the Packages file so that it can be
382 found in all situations. The following is an example for the current stable
383 release, 1.3.1r6:
384
385 <example>
386 Archive: stable
387 Component: main
388 Version: 1.3.1r6
389 Origin: Debian
390 Label: Debian
391 Architecture: i386
392 </example>
393
394 This is an example of experimental,
395 <example>
396 Archive: experimental
397 Version: 0
398 Origin: Debian
399 Label: Debian
400 Architecture: mixed
401 NotAutomatic: Yes
402 </example>
403
404 And unstable,
405 <example>
406 Archive: unstable
407 Component: main
408 Version: 2.1
409 Origin: Debian
410 Label: Debian
411 Architecture: i386
412 </example>
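
<p>
To illustrate how the presentation name 'label version archive' could be
derived from such a file, here is a small Python sketch. The function names
are hypothetical, and silently skipping a tag that is absent (such as
Component for experimental) is an assumption of the sketch, not documented
APT behaviour.

```python
from typing import Dict


def parse_release(text: str) -> Dict[str, str]:
    """Parse a single stanza of 'Tag: value' lines into a dictionary."""
    fields = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        name, _, value = line.partition(":")
        fields[name.strip()] = value.strip()
    return fields


def full_name(release: Dict[str, str], extended: bool = False) -> str:
    """Form 'label version archive', optionally followed by the component."""
    keys = ["Label", "Version", "Archive"]
    if extended:
        keys.append("Component")
    # Tags missing from the file are left out of the name.
    return " ".join(release[k] for k in keys if k in release)


# The stable example from this section.
STABLE = """\
Archive: stable
Component: main
Version: 1.3.1r6
Origin: Debian
Label: Debian
Architecture: i386
"""

print(full_name(parse_release(STABLE), extended=True))
```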
413
414 </sect>
415 <!-- }}} -->
416
417 </book>