1 <!doctype debiandoc system>
2 <!-- -*- mode: sgml; mode: fold -*- -->
3 <book>
4 <title>APT Files</title>
5
6 <author>Jason Gunthorpe <email>jgg@debian.org</email></author>
7 <version>$Id: files.sgml,v 1.1 1998/07/02 02:58:12 jgg Exp $</version>
8
9 <abstract>
10 This document describes the complete implementation and format of the
11 installed APT directory structure. It also serves as a guide to how APT
12 views the Debian archive.
13 </abstract>
14
15 <copyright>
16 Copyright &copy; Jason Gunthorpe, 1998.
17 <p>
18 "APT" and this document are free software; you can redistribute them and/or
19 modify them under the terms of the GNU General Public License as published
20 by the Free Software Foundation; either version 2 of the License, or (at your
21 option) any later version.
22
23 <p>
24 For more details, on Debian GNU/Linux systems, see the file
25 /usr/doc/copyright/GPL for the full license.
26 </copyright>
27
28 <toc sect>
29
30 <chapt>Introduction
31 <!-- General {{{ -->
32 <!-- ===================================================================== -->
33 <sect>General
34
35 <p>
36 This document serves two purposes. The first is to document the installed
37 directory structure and the format and purpose of each file. The second
38 purpose is to document how APT views the Debian archive and deals with
39 multiple package files.
40
41 <p>
42 The installed directory structure is as follows:
43 <example>
44   /var/state/apt/
45     lists/
46       partial/
47     xstatus
48   /var/cache/apt/
49     pkgcache.bin
50     srcpkgcache.bin
51     archives/
52       partial/
53   /etc/apt/
54     sources.list
55     cdromdevs.list
56   /usr/lib/apt/
57     methods/
58       cdrom
59       ftp
60       http
61 </example>
62
63 <p>
64 As specified in FHS 2.0, /var/state/apt is used for application
65 data that is not expected to be user modified. /var/cache/apt is used
66 for regeneratable data and is where the package cache and downloaded .debs
67 go.
68 </sect>
69 <!-- }}} -->
70
71 <chapt>Files
72 <!-- Distribution Source List {{{ -->
73 <!-- ===================================================================== -->
74 <sect>Distribution Source list (sources.list)
75
76 <p>
77 The distribution source list is used to locate archives of the debian
78 distribution. It is designed to support any number of active sources and to
79 support a mix of source media. The file lists one source per line, with the
80 fastest source listed first. The format of each line is:
81
82 <p>
83 <var>type uri args</var>
84
85 <p>
86 The first item, <var>type</var>, indicates the format for the remainder
87 of the line. It is designed to indicate the structure of the distribution
88 the line is talking about. Currently the only defined value is <em>deb</em>
89 which indicates a standard debian archive with a dists dir.
90
91 <sect1>The deb Type
92 <p>
93 The <em>deb</em> type describes a typical two-level debian distribution,
94 dists/<var>distribution</var>/<var>component</var>. Typically distribution
95 is one of stable, unstable or frozen while component is one of main,
96 contrib, non-free or non-us. The format for the deb line is as follows:
97
98 <p>
99 deb <var>uri</var> <var>distribution</var> <var>component</var>
100 [<var>component</var> ...]
101
102 <p>
103 <var>uri</var> for the <em>deb</em> type must specify the base of the
104 debian distribution. APT will automatically generate the proper longer
105 URIs to get the information it needs. <var>distribution</var> can also specify
106 an exact path; in this case the components must be omitted and
107 <var>distribution</var> must end in a slash.
108
109 <p>
110 Since only one distribution can be specified per deb line it may be
111 necessary to list a number of deb lines for the same URI. APT will
112 sort the URI list after it has generated a complete set to allow
113 connection reuse. It is important to order entries in the source list
114 from most preferred to least preferred (fastest to slowest).
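<p>
The line format above can be sketched as a small parser. This is only an
illustration: the helper name <var>parse_deb_line</var> is hypothetical, and
it assumes fields are separated by whitespace, so it does not handle URIs
that contain spaces (such as the cdrom example later in this document).

```python
def parse_deb_line(line):
    """Split one source list line into its fields (hypothetical helper)."""
    fields = line.split()
    if len(fields) < 3:
        raise ValueError("expected: deb uri distribution [component ...]")
    kind, uri, distribution = fields[0], fields[1], fields[2]
    components = fields[3:]
    if kind != "deb":
        # deb is the only type the document defines.
        raise ValueError("unknown source type: " + kind)
    if distribution.endswith("/"):
        # Exact-path form: the components must be omitted.
        if components:
            raise ValueError("components are not allowed with an exact path")
    elif not components:
        raise ValueError("at least one component is required")
    return {
        "type": kind,
        "uri": uri,
        "distribution": distribution,
        "components": components,
    }


entry = parse_deb_line("deb http://www.debian.org/archive stable main contrib")
print(entry["distribution"], entry["components"])
```

<p>
The sketch only validates and splits the line. How APT expands the result
into the longer URIs is not covered here.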
115 </sect1>
116
117 <sect1>URI specification
118 <p>
119 URIs in the source list support a large number of access schemes.
120
121 <taglist>
122 <tag>cdrom<item>
123 The cdrom scheme is special in that If-Modified-Since queries are never
124 performed and that APT knows how to match a cdrom to the name it
125 was given when first inserted. It does this by examining the date
126 and size of the package file. APT also knows all of the possible
127 prefix paths for the cdrom drives and that the user should be prompted
128 to insert a CD if it cannot be found. The path is relative to an
129 arbitrary mount point (of APT's choosing) and must not start with a
130 slash. The first pathname component is the given name and is purely
131 descriptive and of the user's choice. However, if a file in the root of
132 the cdrom is called 'cdname' its contents will be used instead of
133 prompting. The name serves as a tag for the cdrom and should be unique.
134 APT will track the CD-ROMs based on their tag and package file
135 properties.
136 <example>
137 cdrom:Debian 1.3/debian
138 </example>
139
140 <tag>http<item>
141 This scheme specifies an HTTP server for the debian archive. HTTP is preferred
142 over FTP because If-Modified-Since queries against the Packages file are
143 possible. Newer HTTP protocols may even support reget which would make
144 http the protocol of choice.
145 <example>
146 http://www.debian.org/archive
147 </example>
148
149 <tag>ftp<item>
150 This scheme specifies an FTP connection to the server. FTP is limited because
151 it has no support for IMS and is hard to proxy through firewalls.
152 <example>
153 ftp://ftp.debian.org/debian
154 </example>
155
156 <tag>file<item>
157 The file scheme allows an arbitrary directory in the file system to be
158 considered as a debian archive. This is useful for NFS mounts and
159 local mirrors/archives.
160 <example>
161 file:/var/debian
162 </example>
163
164 <tag>mirror<item>
165 The mirror scheme is special in that it does not specify the location of a
166 debian archive but specifies the location of a list of mirrors to use
167 to access the archive. Some technique will be used to determine the
168 best choice for a mirror. The mirror file is specified in the Mirror File
169 section. If/when URIs take off they should obsolete this field.
170 <example>
171 mirror:http://www.debian.org/archivemirrors
172 </example>
173
174 <tag>smb<item>
175 A possible future expansion may be to have direct support for smb (Samba
176 servers).
177 <example>
178 smb://ftp.kernel.org/pub/mirrors/debian
179 </example>
180 </taglist>
181 </sect1>
182
183 <sect1>Hashing the URI
184 <p>
185 All permanent information acquired from any of the sources is stored in the
186 lists directory. Thus, there must be a way to relate the filename in the
187 lists directory to a line in the source list. To simplify things this is
188 done by quoting the URI, treating ='s as quotable characters, and then
189 converting / to =. The URI spec says quoting is done by converting a
190 sensitive character into %xx where xx is the hexadecimal representation
191 from the ascii character set. Examples:
192
193 <example>
194 http://www.debian.org/archive/dists/stable/binary-i386/Packages
195 /var/state/apt/lists/www.debian.org=archive=dists=stable=binary-i386=Packages
196
197 cdrom:Debian 1.3/debian/Packages
198 /var/state/apt/lists/Debian%201.3=debian=Packages
199 </example>
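
<p>
A minimal sketch of this mapping follows. It assumes, based only on the
two examples above, that the access scheme and any leading slashes are dropped
before quoting. The function name <var>uri_to_list_name</var> is hypothetical.

```python
from urllib.parse import quote


def uri_to_list_name(uri):
    """Map a source URI to a file name for the lists directory (sketch)."""
    # Assumption: drop the access scheme ("http:", "cdrom:") and leading slashes.
    rest = uri.split(":", 1)[1].lstrip("/")
    # Percent-quote everything except "/", so spaces become %20 and "=" becomes %3D.
    quoted = quote(rest, safe="/")
    # Finally flatten the path by converting "/" to "=".
    return quoted.replace("/", "=")


print(uri_to_list_name("cdrom:Debian 1.3/debian/Packages"))
```

<p>
Because any literal = is quoted before / is converted, every = in the resulting
name stands for a path separator, so the mapping can be reversed.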
200
201 <p>
202 The other alternative that was considered was to use a deep directory
203 structure but this poses two problems: it makes it very difficult to prune
204 directories back when sources are no longer used and complicates the handling
205 of the partial directory. The flat scheme gives a very simple way to deal with
206 all of the situations that can arise. The equals sign was chosen on the
207 suggestion of Manoj because it is very infrequently used in filenames.
208 Also note that the same rules described in the <em>Downloads Directory</>
209 section regarding the partial sub dir apply here as well.
210 </sect1>
211
212 </sect>
213 <!-- }}} -->
214 <!-- Extra Status {{{ -->
215 <!-- ===================================================================== -->
216 <sect>Extra Status File (xstatus)
217
218 <p>
219 The extra status file serves the same purpose as the normal dpkg status file
220 (/var/lib/dpkg/status) except that it stores information unique to APT.
221 This includes the autoflag, target distribution and version and any other
222 unique features that come up over time. It duplicates nothing from the normal
223 dpkg status file. Please see other APT documentation for a discussion
224 of the exact internal behavior of these fields. The Package field is
225 placed directly before the new fields to indicate which package they
226 apply to. The new fields are as follows:
227
228 <taglist>
229 <tag>X-Auto<item>
230 The Auto flag can be Yes or No and controls whether the package is in
231 auto mode.
232
233 <tag>X-TargetDist<item>
234 The TargetDist item indicates the distribution from which versions are
235 offered for installation. It should be stable, unstable or frozen.
236
237 <tag>X-TargetVersion<item>
238 The target version item is set if the user selects a specific version; it
239 overrides the TargetDist selection if both are present.
240 </taglist>
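<p>
As an illustration only, the fields above could be read as follows. The sketch
assumes the file uses blank-line separated stanzas of "Field: value" lines, as
the dpkg status file does. The package names and values in the sample, and the
helper names, are made up.

```python
def parse_xstatus(text):
    """Return {package name: {field: value}} for each stanza (sketch)."""
    packages = {}
    current = None
    for line in text.splitlines():
        if not line.strip():
            current = None  # a blank line ends the stanza
            continue
        name, _, value = line.partition(":")
        name, value = name.strip(), value.strip()
        if name == "Package":
            # The Package field comes first and names the owner of what follows.
            current = packages.setdefault(value, {})
        elif current is not None:
            current[name] = value
    return packages


def effective_target(fields):
    """X-TargetVersion overrides X-TargetDist when both are present."""
    if "X-TargetVersion" in fields:
        return ("version", fields["X-TargetVersion"])
    if "X-TargetDist" in fields:
        return ("dist", fields["X-TargetDist"])
    return None


SAMPLE = """Package: foo
X-Auto: Yes
X-TargetDist: stable

Package: bar
X-TargetDist: unstable
X-TargetVersion: 1.2-3
"""

for package, fields in parse_xstatus(SAMPLE).items():
    print(package, effective_target(fields))
```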
241 </sect>
242 <!-- }}} -->
243 <!-- Binary Package Cache {{{ -->
244 <!-- ===================================================================== -->
245 <sect>Binary Package Cache (pkgcache.bin)
246
247 <p>
248 Please see cache.sgml for a complete description of what this file is. The
249 cache file is updated whenever the contents of the lists directory changes.
250 If the cache is erased, corrupted or of a non-matching version it will
251 be automatically rebuilt by all of the tools that need it.
252 <em>srcpkgcache.bin</> contains a cache of all of the package files in the
253 source list. This allows regeneration of the cache when the status files
254 change to use a prebuilt version for greater speed.
255 </sect>
256 <!-- }}} -->
257 <!-- Downloads Directory {{{ -->
258 <!-- ===================================================================== -->
259 <sect>Downloads Directory (archives)
260
261 <p>
262 The archives directory is where all downloaded .deb archives go. When the
263 file transfer is initiated the deb is placed in partial. Once the file
264 is fully downloaded and its MD5 hash and size are verified it is moved
265 from partial into archives/. Any files found in archives/ can be assumed
266 to be verified.
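
<p>
The verify-then-move step might look like the following sketch. The function
name <var>promote_if_verified</var> and its arguments are hypothetical, and
the expected hash and size are assumed to come from the package lists.

```python
import hashlib
import os


def promote_if_verified(partial_path, archives_dir, expected_md5, expected_size):
    """Move a finished download out of partial/ only if it checks out."""
    if os.path.getsize(partial_path) != expected_size:
        return False
    digest = hashlib.md5()
    with open(partial_path, "rb") as handle:
        # Hash in chunks so a large .deb is never held in memory at once.
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    if digest.hexdigest() != expected_md5.lower():
        return False
    target = os.path.join(archives_dir, os.path.basename(partial_path))
    # os.replace is atomic when both paths are on the same file system.
    os.replace(partial_path, target)
    return True
```

<p>
Since the rename happens only after both checks pass, a file never appears in
archives/ in an unverified state.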
267
268 <p>
269 No directory structure is transferred from the receiving site and all .deb
270 file names conform to debian conventions. No short (msdos) filename should
271 be placed in archives. If the need arises .debs should be unpacked, scanned
272 and renamed to their correct internal names. This is mostly to prevent
273 file name conflicts but other programs may depend on this if convenient.
274 Downloaded .debs must be found in one of the package lists with an exact
275 name + version match.
276 </sect>
277 <!-- }}} -->
278 <!-- The Methods Directory {{{ -->
279 <!-- ===================================================================== -->
280 <sect> The Methods Directory (/usr/lib/apt/methods)
281
282 <p>
283 Like dselect, APT will support pluggable acquisition methods to complement
284 its internally supported methods. The files in
285 this directory are executables named after the URI type. APT will
286 sort the required URIs and spawn these programs giving a full sorted, quoted
287 list of URIs.
288
289 <p>
290 The interface is simple: the program will be given a list
291 of URIs on the command line. The URIs will be in pairs, the first
292 being the actual URI and the second being the filename to write the data to.
293 The current directory will be set properly by APT and it is
294 expected the method will put files relative to the current directory.
295 The output of these programs is strictly specified. The programs must accept
296 nothing from stdin (stdin will be an invalid fd) and they must output
297 status information to stdout according to the format below.
298 Stderr will be redirected to the logging facility.
299
300 <p>
301 Each line sent to stdout must begin with a single tag letter and a
302 space. Strings after the tag letter do not need quoting; they are taken
303 as is till the end of the line. The tag letters, listed in expected order,
304 are as follows:
305
306 <taglist>
307
308 <tag>F - Change URI<item>
309 This specifies a change in URI. All information after this will be applied
310 to the new URI. When the URI is changed it is assumed that the old URI has
311 completed unless an error is set. The format is <var>F URI</>
312
313 <tag>S - Object Size<item>
314 This specifies the expected size of the object. APT will use this to
315 compute percent done figures. If it is not sent then a kilobyte meter
316 will be used instead of a percent display. The format is <var>S INTEGER</>
317
318 <tag>E - Error Information<item>
319 Exactly one line of error information can be set for each URI. The
320 information will be summarized for the user. If an E tag is sent before
321 any F tags then the error is assumed to be a fatal method error and all URI
322 fetches for that method are aborted with that error string. The format
323 is <var>E String</>
324
325 <tag>I - Informative progress information<item>
326 The I tag allows the method to specify the status of the connection.
327 Typically the GUI will show the last received I line. The format is
328 <var>I String</>. As a general rule an I tag should be emitted before a
329 lengthy operation only. Things that always take a short period are not
330 suited for I tags. I tags should change whenever the method's state changes.
331 Some standard forms, in order of occurrence, are <var>Connecting to SITE</>,
332 <var>Connecting to SITE (1.1.1.1)</>, <var>Waiting for file</>,
333 <var>Authenticating</>, <var>Downloading</>, <var>Resuming (size)</>,
334 <var>Computing MD5</>. <var>I</> lines should never print out information that
335 APT is already aware of, such as file names.
336
337 <tag>R - Set final path<item>
338 The R tag allows the method to tell APT that the file is present in the
339 local file system. APT might copy it into the download directory. The format
340 is <var>R String</>
341
342 <tag>M - MD5Sum of the file<item>
343 The method is expected to compute the md5 hash on the fly as the download
344 progresses. The final md5 of the file is to be output when the file is
345 completed. If the md5 is not output it will not be checked! Some methods
346 such as the file method will not check md5s because they are most
347 commonly used on mirrors or local CD-ROMs; a paranoid option may be
348 provided in future to force checking. The format is <var>M MD5-String</>
349
350 <tag>L - Log output<item>
351 This tag indicates a string that should be dumped to some log file. The
352 string is for debugging and is not meant to be seen by the user. The format
353 is <var>L String</>. Log lines should only be used in a completed method
354 if they have special relevance to what is happening.
355 </taglist>
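
<p>
A reader for this status stream could be sketched as below. The names
<var>parse_method_line</var> and <var>collect_results</var> are hypothetical,
and the choice to keep only the most recent value per tag for each URI is a
simplification made for this sketch.

```python
TAGS = {"F": "uri", "S": "size", "E": "error", "I": "info",
        "R": "path", "M": "md5", "L": "log"}


def parse_method_line(line):
    """Split 'X rest of line' into (tag letter, unquoted string)."""
    line = line.rstrip("\n")
    if len(line) < 2 or line[1] != " " or line[0] not in TAGS:
        raise ValueError("malformed status line: %r" % line)
    return line[0], line[2:]


def collect_results(lines):
    """Group status lines by the URI announced with the preceding F tag."""
    results = {}
    current = None
    fatal_error = None
    for line in lines:
        tag, value = parse_method_line(line)
        if tag == "F":
            # Everything after this line applies to the new URI.
            current = results.setdefault(value, {})
        elif current is None:
            if tag == "E":
                # An E tag before any F tag is a fatal method error.
                fatal_error = value
                break
            # Other tags before the first F carry no per-URI state; skip them.
        elif tag == "S":
            current["size"] = int(value)
        else:
            # Later lines overwrite earlier ones, so the last I line wins.
            current[TAGS[tag]] = value
    return results, fatal_error


stream = [
    "I Connecting to SITE",
    "F http://www.debian.org/archive/dists/stable/binary-i386/Packages",
    "S 1024",
    "I Downloading",
]
print(collect_results(stream))
```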
356
357 <p>
358 APT monitors the progress of the transfer by watching the file size. This
359 means the method must not create any temp files and must use a fairly small
360 buffer. The method is also responsible for If-Modified-Since (IMS) queries
361 for the object. It should check ../outputname to get the time stamp but not
362 the size. The size may be different because the file was uncompressed after
363 it was transferred. A method must <em>never</> change the file in ..; it may
364 only change the output file in the current directory.
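
<p>
For an HTTP method, the time stamp lookup might be sketched as follows. The
function name <var>ims_header_value</var> and its <var>parent</var> argument
are hypothetical. The sketch reads only the modification time of the existing
file and never touches the file itself.

```python
import os
from email.utils import formatdate


def ims_header_value(output_name, parent=".."):
    """Return an If-Modified-Since date for parent/output_name, or None."""
    previous = os.path.join(parent, output_name)
    try:
        mtime = os.stat(previous).st_mtime
    except FileNotFoundError:
        # No earlier copy exists, so an unconditional fetch is needed.
        return None
    # Only the time stamp is used; the size may differ after decompression.
    return formatdate(mtime, usegmt=True)
```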
365
366 <p>
367 The APT 'http' program is the reference implementation of this specification;
368 it implements all of the features a method is expected to provide.
369 </sect>
370 <!-- }}} -->
371 <!-- The Mirror List {{{ -->
372 <!-- ===================================================================== -->
373 <sect> The Mirror List
374
375 <p>
376 The mirror list is stored on the primary debian web server (www.debian.org)
377 and contains a machine readable list of all known debian mirrors. The mirror
378 URI type will cause this list to be downloaded and considered. It has the
379 same form as the source list. When the source list specifies mirror
380 as the target the mirror list is scanned to find the necessary parts for
381 the requested distributions and components. This means the user could
382 have a line like:
383
384 <var>deb mirror:http://www.debian.org/mirrorlist stable main non-us</var>
385
386 which would likely cause APT to choose two separate sites to download from,
387 one for main and another for non-us.
388
389 <p>
390 Some form of network measurement will have to be used to gauge performance
391 of each of the mirrors. This will be discussed later; initial versions
392 will use the first found URI.
393 </sect>
394 <!-- }}} -->
395 <!-- The Release File {{{ -->
396 <!-- ===================================================================== -->
397 <sect> The Release File
398
399 <p>
400 This file plays an important role in how APT presents the archive to the
401 user. Its main purpose is to present a descriptive name for the source
402 of each version of each package. It also is used to detect when new versions
403 of debian are released. It augments the package file it is associated with
404 by providing meta information about the entire archive which the Packages
405 file describes.
406
407 <p>
408 The full name of the distribution for presentation to the user is formed
409 as 'label version archive', with a possible extended name being
410 'label version archive component'.
411
412 <p>
413 The file is formed as the package file (RFC-822) with the following tags
414 defined:
415
416 <taglist>
417 <tag>Archive<item>
418 This is the common name we give our archives, such as <em>stable</> or
419 <em>unstable</>.
420
421 <tag>Component<item>
422 Refers to the sub-component of the archive, <em>main</>, <em>contrib</>
423 etc.
424
425 <tag>Version<item>
426 This is a version string with the same properties as in the Packages file.
427 It represents the release level of the archive.
428
429 <tag>Origin<item>
430 This specifies who is providing this archive. In the case of Debian the
431 string will read 'Debian'. Other providers may use their own string.
432
433 <tag>Label<item>
434 This carries the encompassing name of the distribution. For Debian proper
435 this field reads 'Debian'. For derived distributions it should contain their
436 proper name.
437
438 <tag>Architecture<item>
439 When the archive has packages for a single architecture then the Architecture
440 is listed here. If a mixed set of systems are represented then this should
441 contain the keyword <em>mixed</em>.
442
443 <tag>NotAutomatic<item>
444 A Yes/No flag indicating that the archive is extremely unstable and its
445 versions should never be automatically selected. This is to be used by
446 experimental.
447
448 <tag>Description<item>
449 Description is used to describe the release. For instance experimental would
450 contain a warning that the packages have problems.
451 </taglist>
452
453 <p>
454 The location of the Release file in the archive is very important; it must
455 be located in the same location as the packages file so that it can be
456 located in all situations. The following is an example for the current stable
457 release, 1.3.1r6:
458
459 <example>
460 Archive: stable
461 Component: main
462 Version: 1.3.1r6
463 Origin: Debian
464 Label: Debian
465 Architecture: i386
466 </example>
467
468 This is an example of experimental,
469 <example>
470 Archive: experimental
471 Version: 0
472 Origin: Debian
473 Label: Debian
474 Architecture: mixed
475 NotAutomatic: Yes
476 </example>
477
478 And unstable,
479 <example>
480 Archive: unstable
481 Component: main
482 Version: 2.1
483 Origin: Debian
484 Label: Debian
485 Architecture: i386
486 </example>
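
<p>
To illustrate how the presentation name is formed from these tags, here is a
small sketch. The helper names are hypothetical and the parser handles only
simple single-line "Tag: value" fields.

```python
def parse_release(text):
    """Read simple 'Tag: value' lines into a dictionary (sketch)."""
    fields = {}
    for line in text.splitlines():
        if ":" in line:
            name, _, value = line.partition(":")
            fields[name.strip()] = value.strip()
    return fields


def display_name(fields, extended=False):
    """Form 'label version archive', optionally followed by the component."""
    parts = [fields.get("Label"), fields.get("Version"), fields.get("Archive")]
    if extended:
        parts.append(fields.get("Component"))
    # Tags that are absent from the file are simply left out of the name.
    return " ".join(part for part in parts if part)


def auto_selectable(fields):
    """Versions from a NotAutomatic archive are never selected automatically."""
    return fields.get("NotAutomatic", "No").lower() != "yes"


STABLE = """Archive: stable
Component: main
Version: 1.3.1r6
Origin: Debian
Label: Debian
Architecture: i386
"""

print(display_name(parse_release(STABLE), extended=True))
```

<p>
For the stable example this yields 'Debian 1.3.1r6 stable main'.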
487
488 </sect>
489 <!-- }}} -->
490
491 </book>