1/*!
2 \addtogroup vldb-spec VLDB Server Interface
3 @{
4
5 \page title AFS-3 Programmer's Reference: Volume Server/Volume Location
6Server Interface
7
8 \author Edward R. Zayas
9Transarc Corporation
10\version 1.0
11\date 29 August 1991 14:48 Copyright 1991 Transarc Corporation All Rights
12Reserved FS-00-D165
13
14
15 \page chap1 Chapter 1: Overview
16
17 \section sec1-1 Section 1.1: Introduction
18
19\par
20This document describes the architecture and interfaces for two of the
21important agents of the AFS distributed file system, the Volume Server and the
22Volume Location Server. The Volume Server allows operations affecting entire
23AFS volumes to be executed, while the Volume Location Server provides a lookup
24service for volumes, identifying the server or set of servers on which volume
25instances reside.
26
27 \section sec1-2 Section 1.2: Volumes
28
29 \subsection sec1-2-1 Section 1.2.1: Definition
30
31\par
32The underlying concept manipulated by the two AFS servers examined by this
33document is the volume. Volumes are the basic mechanism for organizing the data
34stored within the file system. They provide the foundation for addressing,
35storing, and accessing file data, along with serving as the administrative
36units for replication, backup, quotas, and data motion between File Servers.
37\par
38Specifically, a volume is a container for a hierarchy of files, a connected
39file system subtree. In this respect, a volume is much like a traditional unix
40file system partition. Like a partition, a volume can be mounted in the sense
41that the root directory of the volume can be named within another volume at an
42AFS mount point. The entire file system hierarchy is built up in this manner,
43using mount points to glue together the individual subtrees resident within
44each volume. The root of this hierarchy is then mounted by each AFS client
45machine using a conventional unix mount point within the workstation's local
46file system. By convention, this entryway into the AFS domain is mounted on the
47/afs local directory. From a user's point of view, there is only a single mount
48point to the system; the internal mount points are generally transparent.
49
50 \subsection sec1-2-2 Section 1.2.2: Volume Naming
51
52\par
53There are two methods by which volumes may be named. The first is via a
54human-readable string name, and the second is via a 32-bit numerical
55identifier. Volume identifiers, whether string or numerical, must be unique
56within any given cell. AFS mount points may use either representation to
57specify the volume whose root directory is to be accessed at the given
58position. Internally, however, AFS agents use the numerical form of
59identification exclusively, having to translate names to the corresponding
6032-bit value.
61
62 \subsection sec1-2-3 Section 1.2.3: Volume Types
63
64\par
65There are three basic volume types: read-write, read-only, and backup volumes.
66\li Read-write: The data in this volume may be both read and written by those
67clients authorized to do so.
68\li Read-only: It is possible to create one or more read-only snapshots of
69read-write volumes. The read-write volume serving as the source image is
70referred to as the parent volume. Each read-only clone, or child, instance must
71reside on a different unix disk partition than the other clones. Every clone
72instance generated from the same parent read-write volume has the identical
73volume name and numerical volume ID. This is the reason why no two clones may
74appear on the same disk partition, as there would be no way to differentiate
75the two. AFS clients are allowed to read files and directories from read-only
76volumes, but cannot overwrite them individually. However, it is possible to
77make changes to the read-write parent and then release the contents of the
78entire volume to all the read-only replicas. The release operation fails if it
79does not reach the appropriate replication sites.
80\li Backup: A backup volume is a special instance of a read-only volume. While
81it is also a read-only snapshot of a given read-write volume, only one instance
82is allowed to exist at any one time. Also, the backup volume must reside on the
83same partition as the parent read-write volume from which it was created. It is
84from a backup volume that the AFS backup system writes file system data to
85tape. In addition, backup volumes may be mounted into the file tree just like
86the other volume types. In fact, by convention, the backup volume for each
87user's home directory subtree is typically mounted as OldFiles in that
88directory. If a user accidentally deletes a file that resides in the backup
89snapshot, the user may simply copy it out of the backup directly without the
90assistance of a system administrator, or any kind of tape restore operation.
Backup volumes are implemented in a copy-on-write fashion. Thus, backup volumes
92may be envisioned as consisting of a set of pointers to the true data objects
93in the base read-write volume when they are first created. When a file is
94overwritten in the read-write version for the first time after the backup
95volume was created, the original data is physically written to the backup
volume, breaking the copy-on-write link. With this mechanism, backup volumes
97maintain the image of the read-write volume at the time the snapshot was taken
98using the minimum amount of additional disk space.
99
100 \section sec1-3 Section 1.3: Scope
101
102\par
103This paper is a member of a documentation suite providing specifications of the
104operation and interfaces offered by the various AFS servers and agents. The
105scope of this work is to provide readers with a sufficiently detailed
106description of the Volume Location Server and the Volume Server so that they
107may construct client applications which call their RPC interface routines.
108
109 \section sec1-4 Section 1.4: Document Layout
110
111\par
112After this introductory portion of the document, Chapters 2 and 3 examine the
113architecture and RPC interface of the Volume Location Server and its replicated
114database. Similarly, Chapters 4 and 5 describe the architecture and RPC
115interface of the Volume Server.
116
117 \page chap2 Chapter 2: Volume Location Server Architecture
118
119 \section sec2-1 Section 2.1: Introduction
120
121\par
122The Volume Location Server allows AFS agents to query the location and basic
123status of volumes resident within the given cell. Volume Location Server
124functions may be invoked directly from authorized users via the vos utility.
125\par
126This chapter briefly discusses various aspects of the Volume Location Server's
127architecture. First, the need for volume location is examined, and the specific
128parties that call the Volume Location Server interface routines are identified.
129Then, the database maintained to provide volume location service, the Volume
130Location Database (VLDB), is examined. Finally, the vlserver process which
131implements the Volume Location Server is considered.
132\par
133As with all AFS servers, the Volume Location Server uses the Rx remote
134procedure call package for communication with its clients.
135
136 \section sec2-2 Section 2.2: The Need For Volume Location
137
138\par
139The Cache Manager agent is the primary consumer of AFS volume location service,
140on which it is critically dependent for its own operation. The Cache Manager
141needs to map volume names or numerical identifiers to the set of File Servers
142on which its instances reside in order to satisfy the file system requests it
is processing on behalf of its clients. Each time a Cache Manager encounters a
144mount point for which it does not have location information cached, it must
145acquire this information before the pathname resolution may be successfully
146completed. Once the File Server set is known for a particular volume, the Cache
147Manager may then select the proper site among them (e.g. choosing the single
148home for a read-write volume, or randomly selecting a site from a read-only
149volume's replication set) and begin addressing its file manipulation operations
150to that specific server.
151\par
152While the Cache Manager consults the volume location service, it is not capable
153of changing the location of volumes and hence modifying the information
154contained therein. This capability to perform acts which change volume location
155is concentrated within the Volume Server. The Volume Server process running on
156each server machine manages all volume operations affecting that platform,
157including creations, deletions, and movements between servers. It must update
158the volume location database every time it performs one of these actions.
159\par
160None of the other AFS system agents has a need to access the volume location
161database for its site. Surprisingly, this also applies to the File Server
process. It is only aware of the specific set of volumes that reside on the set
of physical disks directly attached to the machine on which it executes. It
has no knowledge of the universe of volumes resident on other servers, either
165within its own cell or in foreign cells.
166
167 \section sec2-3 Section 2.3: The VLDB
168
169\par
170The Volume Location Database (VLDB) is used to allow AFS application programs
to discover the location of any volume within their cell, along with select
information about the nature and state of that volume. It is organized in a
very straightforward fashion, and uses the ubik [4] [5] facility to provide
174replication across multiple server sites.
175
176 \subsection sec2-3-1 Section 2.3.1: Layout
177
178\par
179The VLDB itself is a very simple structure, and synchronized copies may be
180maintained at two or more sites. Basically, each copy consists of header
181information, followed by a linear (yet unbounded) array of entries. There are
182several associated hash tables used to perform lookups into the VLDB. The first
183hash table looks up volume location information based on the volume's name.
184There are three other hash tables used for lookup, based on volume ID/type
185pairs, one for each possible volume type.
186\par
187The VLDB for a large site may grow to contain tens of thousands of entries, so
188some attempts were made to make each entry as small as possible. For example,
server addresses within VLDB entries are represented as single-byte indices
190into a table containing the full longword IP addresses.
191\par
192A free list is kept for deleted VLDB entries. The VLDB will not grow unless all
193the entries on the free list have been exhausted, keeping it as compact as
194possible.
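
\par
The single-byte server index described above can be sketched as follows. This
is a minimal illustration and not the vlserver's actual code: the table name
ip_table and the helpers server_index() and server_addr() are hypothetical,
and treating a zero slot as unused is an assumption. Only MAXSERVERID and
BADSERVERID (written here with underscores-free names as given in Section
3.2.1) come from this document.

```c
#include <stdint.h>

#define MAXSERVERID 30   // Section 3.2.1
#define BADSERVERID 255  // Section 3.2.1

// Hypothetical in-memory address table; a zero slot is treated as unused.
uint32_t ip_table[MAXSERVERID + 1];

// Return the one-byte index for a nonzero address, registering the address
// in the first unused slot on first use. Returns BADSERVERID when the table
// has no slot left.
uint8_t server_index(uint32_t addr)
{
    int i;
    for (i = 0; i <= MAXSERVERID; i++) {
        if (ip_table[i] == 0)
            ip_table[i] = addr;
        if (ip_table[i] == addr)
            return (uint8_t)i;
    }
    return BADSERVERID;
}

// Expand an index back into the full address; 0 means "no such server".
uint32_t server_addr(uint8_t idx)
{
    return idx <= MAXSERVERID ? ip_table[idx] : 0;
}
```

\par
Each entry then spends one byte per replication site instead of a full
longword, at the cost of one table lookup when the address is needed.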
195
196 \subsection sec2-3-2 Section 2.3.2: Database Replication
197
198\par
199The VLDB, along with other important AFS databases, may be replicated to
200multiple sites to improve its availability. The ubik replication package is
201used to implement this functionality for the VLDB. A full description of ubik
202and of the quorum completion algorithm it implements may be found in [4] and
203[5]. The basic abstraction provided by ubik is that of a disk file replicated
204to multiple server locations. One machine is considered to be the
205synchronization site, handling all write operations on the database file. Read
206operations may be directed to any of the active members of the quorum, namely a
subset of the replication sites large enough to ensure integrity across such
208failures as individual server crashes and network partitions. All of the quorum
209members participate in regular elections to determine the current
210synchronization site. The ubik algorithms allow server machines to enter and
211exit the quorum in an orderly and consistent fashion. All operations to one of
212these replicated "abstract files" are performed as part of a transaction. If
213all the related operations performed under a transaction are successful, then
214the transaction is committed, and the changes are made permanent. Otherwise,
215the transaction is aborted, and all of the operations for that transaction are
216undone.
217
218 \section sec2-4 Section 2.4: The vlserver Process
219
220\par
221The user-space vlserver process is in charge of providing volume location
222service for AFS clients. This program maintains the VLDB replica at its
223particular server, and cooperates with all other vlserver processes running in
224the given cell to propagate updates to the database. It implements the RPC
225interface defined in the vldbint.xg definition file for the rxgen RPC stub
226generator program. As part of its startup sequence, it must discover the VLDB
227version it has on its local disk, move to join the quorum of replication sites
228for the VLDB, and get the latest version if the one it came up with was out of
229date. Eventually, it will synchronize with the other VLDB replication sites,
230and it will begin accepting calls.
231\par
232The vlserver program uses at most three Rx worker threads to listen for
233incoming Volume Location Server calls. It has a single, optional command line
234argument. If the string "-noauth" appears when the program is invoked, then
235vlserver will run in an unauthenticated mode where any individual is considered
236authorized to perform any VLDB operation. This mode is necessary when first
237bootstrapping an AFS installation.
238
239 \page chap3 Chapter 3: Volume Location Server Interface
240
241 \section sec3-1 Section 3.1: Introduction
242
243\par
244This chapter documents the API for the Volume Location Server facility, as
245defined by the vldbint.xg Rxgen interface file and the vldbint.h include file.
246Descriptions of all the constants, structures, macros, and interface functions
247available to the application programmer appear here.
248\par
249It is expected that Volume Location Server client programs run in user space,
250as does the associated vos volume utility. However, the kernel-resident Cache
251Manager agent also needs to call a subset of the Volume Location Server's RPC
252interface routines. Thus, a second Volume Location Server interface is
253available, built exclusively to satisfy the Cache Manager's limited needs. This
254subset interface is defined by the afsvlint.xg Rxgen interface file, and is
255examined in the final section of this chapter.
256
 \section sec3-2 Section 3.2: Constants
258
259\par
260This section covers the basic constant definitions of interest to the Volume
261Location Server application programmer. These definitions appear in the
262vldbint.h file, automatically generated from the vldbint.xg Rxgen interface
263file, and in vlserver.h.
264\par
265Each subsection is devoted to describing the constants falling into the
266following categories:
267\li Configuration and boundary quantities
268\li Update entry bits
269\li List-by-attribute bits
270\li Volume type indices
271\li States for struct vlentry
272\li States for struct vldbentry
273\li ReleaseType argument values
274\li Miscellaneous items
275
276 \subsection sec3-2-1 Section 3.2.1: Configuration and Boundary
277Quantities
278
279\par
280These constants define some basic system values, including configuration
281information.
282
283\par Name
284MAXNAMELEN
285\par Value
28665
287\par Description
288Maximum size of various character strings, including volume name fields in
289structures and host names.
290
291\par Name
292MAXNSERVERS
293\par Value
2948
295\par Description
Maximum number of replication sites for a volume.
297
298\par Name
299MAXTYPES
300\par Value
3013
302\par Description
303Maximum number of volume types.
304
305\par Name
306VLDBVERSION
307\par Value
3081
309\par Description
VLDB version number.
311
312\par Name
313HASHSIZE
314\par Value
3158,191
316\par Description
317Size of internal Volume Location Server volume name and volume ID hash tables.
318This must always be a prime number.
319
320\par Name
321NULLO
322\par Value
3230
324\par Description
325Specifies a null pointer value.
326
327\par Name
328VLDBALLOCCOUNT
329\par Value
33040
331\par Description
332Value used when allocating memory internally for VLDB entry records.
333
334\par Name
335BADSERVERID
336\par Value
337255
338\par Description
339Illegal Volume Location Server host ID.
340
341\par Name
342MAXSERVERID
343\par Value
34430
345\par Description
346Maximum number of servers appearing in the VLDB.
347
348\par Name
349MAXSERVERFLAG
350\par Value
3510x80
352\par Description
353First unused flag value in such fields as serverFlags in struct vldbentry and
354RepsitesNewFlags in struct VldbUpdateEntry.
355
356\par Name
357MAXPARTITIONID
358\par Value
359126
360\par Description
361Maximum number of AFS disk partitions for any one server.
362
363\par Name
364MAXBUMPCOUNT
365\par Value
3660x7fffffff
367\par Description
368Maximum interval that the current high-watermark value for a volume ID can be
369increased in one operation.
370
371\par Name
372MAXLOCKTIME
373\par Value
3740x7fffffff
375\par Description
376Maximum number of seconds that any VLDB entry can remain locked.
377
378\par Name
379SIZE
380\par Value
3811,024
382\par Description
383Maximum size of the name field within a struct.
384
385 \subsection sec3-2-2 Section 3.2.2: Update Entry Bits
386
387\par
388These constants define bit values for the Mask field in the struct
389VldbUpdateEntry. Specifically, setting these bits is equivalent to declaring
390that the corresponding field within an object of type struct VldbUpdateEntry
has been set. For example, setting the VLUPDATE_VOLUMENAME flag in Mask
392indicates that the name field contains a valid value.
393
394\par Name
VLUPDATE_VOLUMENAME
396\par Value
3970x0001
398\par Description
399If set, indicates that the name field is valid.
400
401\par Name
VLUPDATE_VOLUMETYPE
403\par Value
4040x0002
405\par Description
406If set, indicates that the volumeType field is valid.
407
408\par Name
VLUPDATE_FLAGS
410\par Value
4110x0004
412\par Description
413If set, indicates that the flags field is valid.
414
415\par Name
VLUPDATE_READONLYID
417\par Value
4180x0008
419\par Description
420If set, indicates that the ReadOnlyId field is valid.
421
422\par Name
VLUPDATE_BACKUPID
424\par Value
4250x0010
426\par Description
427If set, indicates that the BackupId field is valid.
428
429\par Name
VLUPDATE_REPSITES
431\par Value
4320x0020
433\par Description
434If set, indicates that the nModifiedRepsites field is valid.
435
436\par Name
VLUPDATE_CLONEID
438\par Value
4390x0080
440\par Description
441If set, indicates that the cloneId field is valid.
442
443\par Name
VLUPDATE_REPS_DELETE
445\par Value
4460x0100
447\par Description
448Is the replica being deleted?
449
450\par Name
VLUPDATE_REPS_ADD
452\par Value
4530x0200
454\par Description
455Is the replica being added?
456
457\par Name
VLUPDATE_REPS_MODSERV
459\par Value
4600x0400
461\par Description
462Is the server part of the replica location correct?
463
464\par Name
VLUPDATE_REPS_MODPART
466\par Value
4670x0800
468\par Description
469Is the partition part of the replica location correct?
470
471\par Name
VLUPDATE_REPS_MODFLAG
473\par Value
4740x1000
475\par Description
476Various modification flag values.
477
478 \subsection sec3-2-3 Section 3.2.3: List-By-Attribute Bits
479
480\par
These constants define bit values for the Mask field in struct
VldbListByAttributes, selecting which fields are to be used in a match.
Specifically, setting one of these bits declares that the corresponding field
within an object of type struct VldbListByAttributes is set. For example,
setting the VLLIST_SERVER flag in Mask indicates that the server field contains
a valid value.
486
487\par Name
VLLIST_SERVER
489\par Value
4900x1
491\par Description
492If set, indicates that the server field is valid.
493
494\par Name
VLLIST_PARTITION
496\par Value
4970x2
498\par Description
499If set, indicates that the partition field is valid.
500
501\par Name
VLLIST_VOLUMETYPE
503\par Value
5040x4
505\par Description
506If set, indicates that the volumetype field is valid.
507
508\par Name
VLLIST_VOLUMEID
510\par Value
5110x8
512\par Description
513If set, indicates that the volumeid field is valid.
514
515\par Name
VLLIST_FLAG
517\par Value
5180x10
519\par Description
If set, indicates that the flag field is valid.
521
522 \subsection sec3-2-4 Section 3.2.4: Volume Type Indices
523
524\par
525These constants specify the order of entries in the volumeid array in an object
526of type struct vldbentry. They also identify the three different types of
527volumes in AFS.
528
529\par Name
530RWVOL
531\par Value
5320
533\par Description
534Read-write volume.
535
536\par Name
537ROVOL
538\par Value
5391
540\par Description
541Read-only volume.
542
543\par Name
544BACKVOL
545\par Value
5462
547\par Description
548Backup volume.
549
550 \subsection sec3-2-5 Section 3.2.5: States for struct vlentry
551
552\par
553The following constants appear in the flags field in objects of type struct
554vlentry. The first three values listed specify the state of the entry, while
555all the rest stamp the entry with the type of an ongoing volume operation, such
556as a move, clone, backup, deletion, and dump. These volume operations are the
legal values to provide to the voloper parameter of the VL_SetLock() interface
routine.
\par
For convenience, the constant VLOP_ALLOPERS is defined as the inclusive OR of
the values listed below from VLOP_MOVE through VLOP_DUMP.
562
563\par Name
564VLFREE
565\par Value
5660x1
567\par Description
568Entry is in the free list.
569
570\par Name
571VLDELETED
572\par Value
5730x2
574\par Description
575Entry is soft-deleted.
576
577\par Name
578VLLOCKED
579\par Value
5800x4
581\par Description
582Advisory lock held on the entry.
583
584\par Name
VLOP_MOVE
586\par Value
5870x10
588\par Description
589The associated volume is being moved between servers.
590
591\par Name
VLOP_RELEASE
593\par Value
5940x20
595\par Description
596The associated volume is being cloned to its replication sites.
597
598\par Name
VLOP_BACKUP
600\par Value
6010x40
602\par Description
603A backup volume is being created for the associated volume.
604
605\par Name
VLOP_DELETE
607\par Value
6080x80
609\par Description
610The associated volume is being deleted.
611
612\par Name
VLOP_DUMP
614\par Value
6150x100
616\par Description
617A dump is being taken of the associated volume.
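
\par
As a worked example of the definition above, the following sketch builds
VLOP_ALLOPERS from the five operation bits in this table (it comes to 0x1f0)
and uses it to separate the operation stamp from the state bits. The constant
names are written with underscores, as C identifiers require. The helpers
entry_operation() and is_single_operation() are hypothetical, and the rule
that a voloper argument names exactly one operation is an assumption made for
illustration.

```c
#define VLFREE    0x1
#define VLDELETED 0x2
#define VLLOCKED  0x4

#define VLOP_MOVE    0x10
#define VLOP_RELEASE 0x20
#define VLOP_BACKUP  0x40
#define VLOP_DELETE  0x80
#define VLOP_DUMP    0x100

// Inclusive OR of VLOP_MOVE through VLOP_DUMP.
#define VLOP_ALLOPERS \
    (VLOP_MOVE | VLOP_RELEASE | VLOP_BACKUP | VLOP_DELETE | VLOP_DUMP)

// Extract the operation stamp from a flags word, dropping the state bits.
long entry_operation(long flags)
{
    return flags & VLOP_ALLOPERS;
}

// Assumed rule: a voloper argument is acceptable when it consists of
// exactly one of the operation bits and nothing else.
int is_single_operation(long voloper)
{
    long op = voloper & VLOP_ALLOPERS;
    return op != 0 && op == voloper && (op & (op - 1)) == 0;
}
```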
618
619 \subsection sec3-2-6 Section 3.2.6: States for struct vldbentry
620
621\par
622Of the following constants, the first three appear in the flags field within an
623object of type struct vldbentry, advising of the existence of the basic volume
624types for the given volume, and hence the validity of the entries in the
625volumeId array field. The rest of the values provided in this table appear in
626the serverFlags array field, and apply to the instances of the volume appearing
627in the various replication sites.
628\par
629This structure appears in numerous Volume Location Server interface calls,
namely VL_CreateEntry(), VL_GetEntryByID(), VL_GetEntryByName(),
VL_ReplaceEntry() and VL_ListEntry().
632
633\par Name
VLF_RWEXISTS
635\par Value
6360x1000
637\par Description
638The read-write volume ID is valid.
639
640\par Name
VLF_ROEXISTS
642\par Value
6430x2000
644\par Description
645The read-only volume ID is valid.
646
647\par Name
VLF_BACKEXISTS
649\par Value
6500x4000
651\par Description
652The backup volume ID is valid.
653
654\par Name
VLSF_NEWREPSITE
656\par Value
6570x01
658\par Description
659Not used; originally intended to mark an entry as belonging to a
660partially-created volume instance.
661
662\par Name
VLSF_ROVOL
664\par Value
6650x02
666\par Description
667A read-only version of the volume appears at this server.
668
669\par Name
VLSF_RWVOL
\par Value
0x04
673\par Description
674A read-write version of the volume appears at this server.
675
676\par Name
VLSF_BACKVOL
678\par Value
6790x08
680\par Description
681A backup version of the volume appears at this server.
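
\par
The relationship between the three existence bits and the volumeId array can
be sketched as below. The helper volume_id_valid() is hypothetical; the
constant values are those listed in this section and in Section 3.2.4, with
names written using underscores as C identifiers require.

```c
#define RWVOL   0
#define ROVOL   1
#define BACKVOL 2

#define VLF_RWEXISTS   0x1000
#define VLF_ROEXISTS   0x2000
#define VLF_BACKEXISTS 0x4000

// Return nonzero when the volumeId slot for 'type' is marked valid in the
// flags word of an entry. Unknown type values are reported as not valid.
int volume_id_valid(long flags, int type)
{
    switch (type) {
    case RWVOL:   return (flags & VLF_RWEXISTS) != 0;
    case ROVOL:   return (flags & VLF_ROEXISTS) != 0;
    case BACKVOL: return (flags & VLF_BACKEXISTS) != 0;
    default:      return 0;
    }
}
```

\par
A caller would typically test the bit before trusting the matching volumeId
slot, for example checking ROVOL before using the read-only ID.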
682
683 \subsection sec3-2-7 Section 3.2.7: ReleaseType Argument Values
684
685\par
686The following values are used in the ReleaseType argument to various Volume
Location Server interface routines, namely VL_ReplaceEntry(), VL_UpdateEntry()
and VL_ReleaseLock().
689
690\par Name
LOCKREL_TIMESTAMP
692\par Value
6931
694\par Description
695Is the LockTimestamp field valid?
696
697\par Name
LOCKREL_OPCODE
699\par Value
7002
701\par Description
702Are any of the bits valid in the flags field?
703
704\par Name
LOCKREL_AFSID
706\par Value
7074
708\par Description
709Is the LockAfsId field valid?
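
\par
Because the three values are distinct powers of two, they can be combined in
one ReleaseType argument. The sketch below assumes that reading; LOCKREL_ALL
and release_type_valid() are hypothetical names introduced only for this
example, and the constant names are written with underscores as C identifiers
require.

```c
#define LOCKREL_TIMESTAMP 1
#define LOCKREL_OPCODE    2
#define LOCKREL_AFSID     4

// Hypothetical convenience mask covering all three documented bits.
#define LOCKREL_ALL (LOCKREL_TIMESTAMP | LOCKREL_OPCODE | LOCKREL_AFSID)

// Return nonzero when 't' contains only documented ReleaseType bits.
int release_type_valid(long t)
{
    return (t & ~(long)LOCKREL_ALL) == 0;
}
```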
710
711 \subsection sec3-2-8 Section 3.2.8: Miscellaneous
712
713\par
714Miscellaneous values.
715\par Name
VLREPSITE_NEW
717\par Value
7181
719\par Description
720Has a replication site gotten a new release of a volume?
721\par
A synonym for this constant is VLSF_NEWREPSITE.
723
724 \section sec3-3 Section 3.3: Structures and Typedefs
725
726\par
727This section describes the major exported Volume Location Server data
728structures of interest to application programmers, along with the typedefs
729based upon those structures.
730
731 \subsection sec3-3-1 Section 3.3.1: struct vldbentry
732
733\par
734This structure represents an entry in the VLDB as made visible to Volume
735Location Server clients. It appears in numerous Volume Location Server
interface calls, namely VL_CreateEntry(), VL_GetEntryByID(),
VL_GetEntryByName(), VL_ReplaceEntry() and VL_ListEntry().
738\n \b Fields
739\li char name[] - The string name for the volume, with a maximum length of
740MAXNAMELEN (65) characters, including the trailing null.
741\li long volumeType - The volume type, one of RWVOL, ROVOL, or BACKVOL.
742\li long nServers - The number of servers that have an instance of this volume.
743\li long serverNumber[] - An array of indices into the table of servers,
744identifying the sites holding an instance of this volume. There are at most
745MAXNSERVERS (8) of these server sites allowed by the Volume Location Server.
746\li long serverPartition[] - An array of partition identifiers, corresponding
747directly to the serverNumber array, specifying the partition on which each of
748those volume instances is located. As with the serverNumber array,
749serverPartition has up to MAXNSERVERS (8) entries.
750\li long serverFlags[] - This array holds one flag value for each of the
751servers in the previous arrays. Again, there are MAXNSERVERS (8) slots in this
752array.
\li u_long volumeId[] - An array of volume IDs, one for each volume type. There
754are MAXTYPES slots in this array.
755\li long cloneId - This field is used during a cloning operation.
756\li long flags - Flags concerning the status of the fields within this
757structure; see Section 3.2.6 for the bit values that apply.
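
\par
The field list above can be transcribed into C roughly as follows. This is a
sketch for orientation only: the real definition is generated from
vldbint.xg, and the name vldbentry_sketch, the use of plain long and
unsigned long, and the helper readonly_sites() are assumptions of this
example. The helper shows how the three parallel arrays are read together,
using the VLSF_ROVOL bit from Section 3.2.6.

```c
#define MAXNAMELEN  65
#define MAXNSERVERS 8
#define MAXTYPES    3
#define VLSF_ROVOL  0x02

// Sketch of the client-visible entry, following the field list above.
struct vldbentry_sketch {
    char name[MAXNAMELEN];
    long volumeType;
    long nServers;
    long serverNumber[MAXNSERVERS];
    long serverPartition[MAXNSERVERS];
    long serverFlags[MAXNSERVERS];
    unsigned long volumeId[MAXTYPES];
    long cloneId;
    long flags;
};

// Store in 'out' the slot numbers of all sites holding a read-only
// instance and return how many were found. Slot i refers to
// serverNumber[i], serverPartition[i] and serverFlags[i] together.
int readonly_sites(const struct vldbentry_sketch *e, int out[MAXNSERVERS])
{
    int i, n = 0;
    for (i = 0; i < e->nServers && i < MAXNSERVERS; i++) {
        if (e->serverFlags[i] & VLSF_ROVOL)
            out[n++] = i;
    }
    return n;
}
```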
758
759 \subsection sec3-3-2 Section 3.3.2: struct vlentry
760
761\par
762This structure is used internally by the Volume Location Server to fully
763represent a VLDB entry. The client-visible struct vldbentry represents merely a
764subset of the information contained herein.
765\n \b Fields
\li u_long volumeId[] - An array of volume IDs, one for each of the MAXTYPES
volume types.
768\li long flags - Flags concerning the status of the fields within this
769structure; see Section 3.2.6 for the bit values that apply.
770\li long LockAfsId - The individual who locked the entry. This feature has not
771yet been implemented.
772\li long LockTimestamp - Time stamp on the entry lock.
773\li long cloneId - This field is used during a cloning operation.
774\li long AssociatedChain - Pointer to the linked list of associated VLDB
775entries.
776\li long nextIdHash[] - Array of MAXTYPES next pointers for the ID hash table
777pointer, one for each related volume ID.
778\li long nextNameHash - Next pointer for the volume name hash table.
779\li long spares1[] - Two longword spare fields.
780\li char name[] - The volume's string name, with a maximum of MAXNAMELEN (65)
781characters, including the trailing null.
\li u_char volumeType - The volume's type, one of RWVOL, ROVOL, or BACKVOL.
\li u_char serverNumber[] - An array of indices into the table of servers,
784identifying the sites holding an instance of this volume. There are at most
785MAXNSERVERS (8) of these server sites allowed by the Volume Location Server.
\li u_char serverPartition[] - An array of partition identifiers, corresponding
787directly to the serverNumber array, specifying the partition on which each of
788those volume instances is located. As with the serverNumber array,
789serverPartition has up to MAXNSERVERS (8) entries.
\li u_char serverFlags[] - This array holds one flag value for each of the
791servers in the previous arrays. Again, there are MAXNSERVERS (8) slots in this
792array.
\li u_char RefCount - Only valid for read-write volumes, this field serves as a
reference count, basically the number of dependent child volumes.
795\li char spares2[] - This field is used for 32-bit alignment.
796
 \subsection sec3-3-3 Section 3.3.3: struct vital_vlheader
798
799\par
800This structure defines the leading section of the VLDB header, of type struct
801vlheader. It contains frequently-used global variables and general statistics
802information.
803\n \b Fields
804\li long vldbversion - The VLDB version number. This field must appear first in
805the structure.
806\li long headersize - The total number of bytes in the header.
\li long freePtr - Pointer to the first free entry in the free list, if any.
808\li long eofPtr - Pointer to the first free byte in the header file.
809\li long allocs - The total number of calls to the internal AllocBlock()
810function directed at this file.
811\li long frees - The total number of calls to the internal FreeBlock() function
812directed at this file.
813\li long MaxVolumeId - The largest volume ID ever granted for this cell.
814\li long totalEntries[] - The total number of VLDB entries by volume type in
815the VLDB. This array has MAXTYPES slots, one for each volume type.
816
817 \subsection sec3-3-4 Section 3.3.4: struct vlheader
818
819\par
820This is the layout of the information stored in the VLDB header. Notice it
includes an object of type struct vital_vlheader described above (see Section
3.3.3) as the first field.
823\n \b Fields
\li struct vital_vlheader vital_header - Holds critical VLDB header
825information.
\li u_long IpMappedAddr[] - Keeps MAXSERVERID+1 mappings of IP addresses to
827relative ones.
828\li long VolnameHash[] - The volume name hash table, with HASHSIZE slots.
829\li long VolidHash[][] - The volume ID hash table. The first dimension in this
830array selects which of the MAXTYPES volume types is desired, and the second
831dimension actually implements the HASHSIZE hash table buckets for the given
832volume type.
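
\par
To give a feel for the size of this header, the sketch below lays out the two
structures from the field lists in Sections 3.3.3 and 3.3.4 using fixed
32-bit fields. The 32-bit field width, the _sketch type names, and the
modulo bucket function id_bucket() are assumptions of this example; the
document does not specify the hash function. Under those assumptions the
header occupies (10 + 31 + 8191 + 3 * 8191) * 4 = 131,220 bytes, almost all of
it hash table slots.

```c
#include <stddef.h>
#include <stdint.h>

#define MAXTYPES    3
#define MAXSERVERID 30
#define HASHSIZE    8191

// Leading section of the header (Section 3.3.3), 10 longwords in all.
struct vital_vlheader_sketch {
    int32_t vldbversion;
    int32_t headersize;
    int32_t freePtr;
    int32_t eofPtr;
    int32_t allocs;
    int32_t frees;
    int32_t MaxVolumeId;
    int32_t totalEntries[MAXTYPES];
};

// Full header (Section 3.3.4).
struct vlheader_sketch {
    struct vital_vlheader_sketch vital_header;
    uint32_t IpMappedAddr[MAXSERVERID + 1];
    int32_t VolnameHash[HASHSIZE];
    int32_t VolidHash[MAXTYPES][HASHSIZE];
};

// Size of the sketched header in bytes.
size_t header_bytes(void)
{
    return sizeof(struct vlheader_sketch);
}

// Assumed bucket choice for a volume ID: plain modulo by the prime table
// size. The second index of VolidHash would be this value.
unsigned id_bucket(uint32_t volid)
{
    return volid % HASHSIZE;
}
```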
833
834 \subsection sec3-3-5 Section 3.3.5: struct VldbUpdateEntry
835
836\par
This structure is used as an argument to the VL_UpdateEntry() routine (see
Section 3.6.7). Please note that multiple fields of an entry can be updated at
once by setting the appropriate Mask bits. The bit values for this purpose are
defined in Section 3.2.2.
841\n \b Fields
\li u_long Mask - Bit values determining which fields are to be affected by the
843update operation.
844\li char name[] - The volume name, up to MAXNAMELEN (65) characters including
845the trailing null.
846\li long volumeType - The volume type.
\li long flags - This field is used in conjunction with Mask (in fact, one of
848the Mask bits determines if this field is valid) to choose the valid fields in
849this record.
\li u_long ReadOnlyId - The read-only ID.
\li u_long BackupId - The backup ID.
852\li long cloneId - The clone ID.
853\li long nModifiedRepsites - Number of replication sites whose entry is to be
854changed as below.
855\li u long RepsitesMask[] - Array of bit masks applying to the up to
856MAXNSERVERS (8) replication sites involved.
857\li long RepsitesTargetServer[] - Array of target servers for the operation, at
858most MAXNSERVERS (8) of them.
859\li long RepsitesTargetPart[] - Array of target server partitions for the
860operation, at most MAXNSERVERS (8) of them.
861\li long RepsitesNewServer[] - Array of new server sites, at most MAXNSERVERS
862(8) of them.
863\li long RepsitesNewPart[] - Array of new server partitions for the operation,
864at most MAXNSERVERS (8) of them.
865\li long RepsitesNewFlags[] - Flags applying to each of the new sites, at most
866MAXNSERVERS (8) of them.
867
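\par
A minimal sketch of how a Mask-driven update might be applied is shown below.
The mask bit names and values, the cut-down structures, and the
demo_apply_update() helper are all hypothetical; the real bit values are the
ones listed in Section 3.2.2, and the replication-site arrays are omitted.

```c
#include <assert.h>
#include <string.h>

// Hypothetical mask bits; the real values are listed in Section 3.2.2.
#define DEMO_UPDATE_NAME   0x1UL
#define DEMO_UPDATE_ROID   0x2UL
#define DEMO_UPDATE_BACKID 0x4UL

#define DEMO_MAXNAMELEN 65   // MAXNAMELEN, including the trailing null

// Cut-down stand-ins for the VLDB entry and the update request.
struct demo_entry {
    char name[DEMO_MAXNAMELEN];
    unsigned long ReadOnlyId;
    unsigned long BackupId;
};

struct demo_update {
    unsigned long Mask;
    char name[DEMO_MAXNAMELEN];
    unsigned long ReadOnlyId;
    unsigned long BackupId;
};

// Copy only the fields whose Mask bit is set; leave the others untouched.
static void demo_apply_update(struct demo_entry *e, const struct demo_update *u)
{
    assert(e != NULL && u != NULL);
    if (u->Mask & DEMO_UPDATE_NAME) {
        strncpy(e->name, u->name, DEMO_MAXNAMELEN - 1);
        e->name[DEMO_MAXNAMELEN - 1] = '\0';
    }
    if (u->Mask & DEMO_UPDATE_ROID)
        e->ReadOnlyId = u->ReadOnlyId;
    if (u->Mask & DEMO_UPDATE_BACKID)
        e->BackupId = u->BackupId;
}
```

Fields whose bit is clear keep their previous contents, which is the property
that distinguishes an update from a wholesale replacement.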
868 \subsection sec3-3-6 Section 3.3.6: struct VldbListByAttributes
869
870\par
This structure is used by the VL_ListAttributes() routine (see Section 3.6.11).
\n \b Fields
\li u_long Mask - Bit mask used to select the following attribute fields on
which to match.
875\li long server - The server address to match.
876\li long partition - The partition ID to match.
877\li long volumetype - The volume type to match.
878\li long volumeid - The volume ID to match.
879\li long flag - Flags concerning these values.
880
 \subsection sec3-3-7 Section 3.3.7: struct single_vldbentry

\par
This structure is used to construct the vldblist object (see Section 3.3.12),
which basically generates a queueable (singly-linked) version of struct
vldbentry.
\n \b Fields
\li vldbentry VldbEntry - The VLDB entry to be queued.
\li vldblist next_vldb - The next pointer in the list.
890
 \subsection sec3-3-8 Section 3.3.8: struct vldb_list

\par
This structure defines the item returned in linked list form from the
VL_LinkedList() function (see Section 3.6.12). This same object is also
returned in bulk form in calls to the VL_ListAttributes() routine (see Section
3.6.11).
\n \b Fields
\li vldblist node - The body of the first object in the linked list.
899
900 \subsection sec3-3-9 Section 3.3.9: struct vldstats
901
902\par
This structure defines fields to record statistics on opcode hit frequency. The
MAX_NUMBER_OPCODES constant has been defined as the maximum number of opcodes
supported by this structure, and is set to 30.
\n \b Fields
\li unsigned long start_time - Clock time when opcode statistics were last
cleared.
\li long requests[] - Number of requests received for each of the
MAX_NUMBER_OPCODES opcode types.
\li long aborts[] - Number of aborts experienced for each of the
MAX_NUMBER_OPCODES opcode types.
\li long reserved[] - These five longword fields are reserved for future use.
914
915 \subsection sec3-3-10 Section 3.3.10: bulk
916
\code
typedef opaque bulk<DEFAULTBULK>;
\endcode
\par
This typedef may be used to transfer an uninterpreted set of bytes across the
Volume Location Server interface. It may carry up to DEFAULTBULK (10,000)
bytes.
\n \b Fields
\li bulk_len - The number of bytes contained within the data pointed to by the
next field.
\li bulk_val - A pointer to a sequence of bulk_len bytes.
928
929 \subsection sec3-3-11 Section 3.3.11: bulkentries
930
\code
typedef vldbentry bulkentries<>;
\endcode
\par
This typedef is used to transfer an unbounded number of struct vldbentry
objects. It appears in the parameter list for the VL_ListAttributes() interface
function.
\n \b Fields
\li bulkentries_len - The number of vldbentry structures contained within the
data pointed to by the next field.
\li bulkentries_val - A pointer to a sequence of bulkentries_len vldbentry
structures.
943
944 \subsection sec3-3-12 Section 3.3.12: vldblist
945
\code
typedef struct single_vldbentry *vldblist;
\endcode
\par
This typedef defines a queueable struct vldbentry object, referenced by
struct single_vldbentry as well as struct vldb_list.
952
953 \subsection sec3-3-13 Section 3.3.13: vlheader
954
\code
typedef struct vlheader vlheader;
\endcode
\par
This typedef provides a short name for objects of type struct vlheader (see
Section 3.3.4).

 \subsection sec3-3-14 Section 3.3.14: vlentry

\code
typedef struct vlentry vlentry;
\endcode
\par
This typedef provides a short name for objects of type struct vlentry (see
Section 3.3.2).
970
971 \section sec3-4 Section 3.4: Error Codes
972
973\par
974This section covers the set of error codes exported by the Volume Location
975Server, displaying the printable phrases with which they are associated.
976
\par Name
VL_IDEXIST
\par Value
(363520L)
\par Description
Volume Id entry exists in vl database.

\par Name
VL_IO
\par Value
(363521L)
\par Description
I/O related error.

\par Name
VL_NAMEEXIST
\par Value
(363522L)
\par Description
Volume name entry exists in vl database.

\par Name
VL_CREATEFAIL
\par Value
(363523L)
\par Description
Internal creation failure.

\par Name
VL_NOENT
\par Value
(363524L)
\par Description
No such entry.

\par Name
VL_EMPTY
\par Value
(363525L)
\par Description
Vl database is empty.

\par Name
VL_ENTDELETED
\par Value
(363526L)
\par Description
Entry is deleted (soft delete).

\par Name
VL_BADNAME
\par Value
(363527L)
\par Description
Volume name is illegal.

\par Name
VL_BADINDEX
\par Value
(363528L)
\par Description
Index is out of range.

\par Name
VL_BADVOLTYPE
\par Value
(363529L)
\par Description
Bad volume range.

\par Name
VL_BADSERVER
\par Value
(363530L)
\par Description
Illegal server number (out of range).

\par Name
VL_BADPARTITION
\par Value
(363531L)
\par Description
Bad partition number.

\par Name
VL_REPSFULL
\par Value
(363532L)
\par Description
Run out of space for Replication sites.

\par Name
VL_NOREPSERVER
\par Value
(363533L)
\par Description
No such Replication server site exists.

\par Name
VL_DUPREPSERVER
\par Value
(363534L)
\par Description
Replication site already exists.

\par Name
VL_RWNOTFOUND
\par Value
(363535L)
\par Description
Parent R/W entry not found.

\par Name
VL_BADREFCOUNT
\par Value
(363536L)
\par Description
Illegal Reference Count number.

\par Name
VL_SIZEEXCEEDED
\par Value
(363537L)
\par Description
Vl size for attributes exceeded.

\par Name
VL_BADENTRY
\par Value
(363538L)
\par Description
Bad incoming vl entry.

\par Name
VL_BADVOLIDBUMP
\par Value
(363539L)
\par Description
Illegal max volid increment.

\par Name
VL_IDALREADYHASHED
\par Value
(363540L)
\par Description
RO/BACK id already hashed.

\par Name
VL_ENTRYLOCKED
\par Value
(363541L)
\par Description
Vl entry is already locked.

\par Name
VL_BADVOLOPER
\par Value
(363542L)
\par Description
Bad volume operation code.

\par Name
VL_BADRELLOCKTYPE
\par Value
(363543L)
\par Description
Bad release lock type.

\par Name
VL_RERELEASE
\par Value
(363544L)
\par Description
Status report: last release was aborted.

\par Name
VL_BADSERVERFLAG
\par Value
(363545L)
\par Description
Invalid replication site server flag.

\par Name
VL_PERM
\par Value
(363546L)
\par Description
No permission access.

\par Name
VL_NOMEM
\par Value
(363547L)
\par Description
malloc(realloc) failed to alloc enough memory.
1172
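\par
Because the values above run consecutively from 363520, a caller could map a
code to its phrase by simple subtraction. The sketch below does this for the
first five codes only; the table, the DEMO_VL_BASE name, and the
demo_vl_message() helper are hypothetical and not part of the interface.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_VL_BASE 363520L   // value of VL_IDEXIST, the first code

// Phrases in table order, so that index = code - DEMO_VL_BASE.
static const char *const demo_vl_messages[] = {
    "Volume Id entry exists in vl database.",    // VL_IDEXIST    (363520)
    "I/O related error.",                        // VL_IO         (363521)
    "Volume name entry exists in vl database.",  // VL_NAMEEXIST  (363522)
    "Internal creation failure.",                // VL_CREATEFAIL (363523)
    "No such entry.",                            // VL_NOENT      (363524)
};

// Return the phrase for code, or NULL if code is outside this short table.
static const char *demo_vl_message(long code)
{
    size_t n = sizeof(demo_vl_messages) / sizeof(demo_vl_messages[0]);
    assert(n > 0);
    if (code < DEMO_VL_BASE || (size_t)(code - DEMO_VL_BASE) >= n)
        return NULL;
    return demo_vl_messages[code - DEMO_VL_BASE];
}
```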
1173 \section sec3-5 Section 3.5: Macros
1174
1175\par
1176The Volume Location Server defines a small number of macros, as described in
1177this section. They are used to update the internal statistics variables and to
1178compute offsets into character strings. All of these macros really refer to
1179internal operations, and strictly speaking should not be exposed in this
1180interface.
1181
 \subsection sec3-5-1 Section 3.5.1: COUNT_REQ()

\code
#define COUNT_REQ(op) \
static int this_op = op-VL_LOWEST_OPCODE; \
dynamic_statistics.requests[this_op]++
\endcode
\par
Bump the appropriate entry in the variable maintaining opcode usage statistics
for the Volume Location Server. Note that a static variable is set up to record
this_op, namely the index into the opcode monitoring array. This static
variable is used by the related COUNT_ABO() macro defined below.
1194
 \subsection sec3-5-2 Section 3.5.2: COUNT_ABO()

\code
#define COUNT_ABO dynamic_statistics.aborts[this_op]++
\endcode
\par
Bump the appropriate entry in the variable maintaining opcode abort statistics
for the Volume Location Server. Note that this macro does not take any
arguments. It expects to find a this_op variable in its environment, and thus
depends on its related macro, COUNT_REQ(), to define that variable.
1205
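\par
The dependency between the two macros can be seen in a small, self-contained
sketch. The statistics record is simplified, DEMO_LOWEST_OPCODE (501) is an
assumed stand-in for VL_LOWEST_OPCODE, and demo_handler() is a hypothetical
routine invented for this example.

```c
#include <assert.h>

#define DEMO_LOWEST_OPCODE 501   // assumed stand-in for VL_LOWEST_OPCODE
#define DEMO_NUM_OPCODES   30    // MAX_NUMBER_OPCODES, per Section 3.3.9

// Simplified stand-in for the server's statistics record.
static struct {
    long requests[DEMO_NUM_OPCODES];
    long aborts[DEMO_NUM_OPCODES];
} dynamic_statistics;

#define COUNT_REQ(op) \
    static int this_op = (op) - DEMO_LOWEST_OPCODE; \
    dynamic_statistics.requests[this_op]++
#define COUNT_ABO dynamic_statistics.aborts[this_op]++

// Hypothetical handler: counts the request, then counts an abort on failure.
static int demo_handler(int fail)
{
    COUNT_REQ(503);     // must come first: it declares this_op
    assert(this_op >= 0 && this_op < DEMO_NUM_OPCODES);
    if (fail) {
        COUNT_ABO;      // reuses the this_op declared by COUNT_REQ
        return -1;
    }
    return 0;
}
```

Using COUNT_ABO in a function that has not first invoked COUNT_REQ() would
fail to compile, since this_op would be undeclared there.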
1206 \subsection sec3-5-3 Section 3.5.3: DOFFSET()
1207
\code
#define DOFFSET(abase, astr, aitem) \
    ((abase)+(((char *)(aitem)) -((char *)(astr))))
\endcode
\par
Compute the byte offset of character object aitem within the enclosing object
astr, also expressed as a character-based object, then offset the resulting
address by abase. This macro is used to compute locations within the VLDB when
actually writing out information.
1217
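\par
As a small worked example, DOFFSET() applied to a field of a structure yields
the base address plus that field's offset. The demo_record layout and the
demo_flags_position() helper below are hypothetical and exist only to give the
macro something to operate on.

```c
#include <assert.h>
#include <stddef.h>

#define DOFFSET(abase, astr, aitem) \
    ((abase) + (((char *)(aitem)) - ((char *)(astr))))

// Hypothetical record layout, used only to have fields to point at.
struct demo_record {
    long id;
    char name[65];
    long flags;
};

// Byte position of the flags field when rec is stored at base in a file.
static long demo_flags_position(long base, struct demo_record *rec)
{
    assert(rec != NULL);
    return (long)DOFFSET(base, rec, &rec->flags);
}
```

The result equals base plus offsetof(struct demo_record, flags), so the same
value could be computed without an in-memory instance; the macro form is handy
when a pointer to the item is already at hand.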
1218 \section sec3-6 Section 3.6: Functions
1219
1220\par
1221This section covers the Volume Location Server RPC interface routines. The
1222majority of them are generated from the vldbint.xg Rxgen file, and are meant to
1223be used by user-space agents. There is also a subset interface definition
1224provided in the afsvlint.xg Rxgen file. These routines, described in Section
12253.7, are meant to be used by a kernel-space agent when dealing with the Volume
1226Location Server; in particular, they are called by the Cache Manager.
1227
 \subsection sec3-6-1 Section 3.6.1: VL_CreateEntry - Create a VLDB
entry

\code
int VL_CreateEntry(IN struct rx_connection *z_conn,
                   IN vldbentry *newentry)
\endcode
\par Description
This function creates a new entry in the VLDB, as specified in the newentry
argument. Both the name and numerical ID of the new volume must be unique
(i.e., they must not already appear in the VLDB). For non-read-write entries,
the read-write parent volume is accessed so that its reference count can be
updated, and the new entry is added to the parent's chain of associated
entries.
\par
The VLDB is write-locked for the duration of this operation.
\par Error Codes
VL_PERM The caller is not authorized to execute this function.
\n VL_NAMEEXIST The volume name already appears in the VLDB.
\n VL_CREATEFAIL Space for the new entry cannot be allocated within the VLDB.
\n VL_BADNAME The volume name is invalid.
\n VL_BADVOLTYPE The volume type is invalid.
\n VL_BADSERVER The indicated server information is invalid.
\n VL_BADPARTITION The indicated partition information is invalid.
\n VL_BADSERVERFLAG The server flag field is invalid.
\n VL_IO An error occurred while writing to the VLDB.
1251
 \subsection sec3-6-2 Section 3.6.2: VL_DeleteEntry - Delete a VLDB
entry

\code
int VL_DeleteEntry(IN struct rx_connection *z_conn,
                   IN long Volid,
                   IN long voltype)
\endcode
\par Description
Delete the entry matching the given volume identifier and volume type as
specified in the Volid and voltype arguments. For a read-write entry whose
reference count is greater than 1, the entry is not actually deleted, since at
least one child (read-only or backup) volume still depends on it. For cases of
non-read-write volumes, the parent's reference count and associated chains are
updated.
\par
If the associated VLDB entry is already marked as deleted (i.e., its flags
field has the VLDELETED bit set), then no further action is taken, and
VL_ENTDELETED is returned. The VLDB is write-locked for the duration of this
operation.
\par Error Codes
VL_PERM The caller is not authorized to execute this function.
\n VL_BADVOLTYPE An illegal volume type has been specified by the voltype
argument.
\n VL_NOENT This volume instance does not appear in the VLDB.
\n VL_ENTDELETED The given VLDB entry has already been marked as deleted.
\n VL_IO An error occurred while writing to the VLDB.
1278
 \subsection sec3-6-3 Section 3.6.3: VL_GetEntryByID - Get VLDB entry by
volume ID/type

\code
int VL_GetEntryByID(IN struct rx_connection *z_conn,
                    IN long Volid,
                    IN long voltype,
                    OUT vldbentry *entry)
\endcode
\par Description
Given a volume's numerical identifier (Volid) and type (voltype), return a
pointer to the entry in the VLDB describing the given volume instance.
\par
The VLDB is read-locked for the duration of this operation.
\par Error Codes
VL_BADVOLTYPE An illegal volume type has been specified by the voltype
argument.
\n VL_NOENT This volume instance does not appear in the VLDB.
\n VL_ENTDELETED The given VLDB entry has already been marked as deleted.
1296
 \subsection sec3-6-4 Section 3.6.4: VL_GetEntryByName - Get VLDB entry
by volume name

\code
int VL_GetEntryByName(IN struct rx_connection *z_conn,
                      IN char *volumename,
                      OUT vldbentry *entry)
\endcode
\par Description
Given the volume name in the volumename parameter, return a pointer to the
entry in the VLDB describing the given volume. The name in volumename may be no
longer than MAXNAMELEN (65) characters, including the trailing null. Note that
it is legal to use the volume's numerical identifier (in string form) as the
volume name.
\par
The VLDB is read-locked for the duration of this operation.
\par
This function is closely related to the VL_GetEntryByID() routine, as might be
expected. In fact, the by-ID routine is called if the volume name provided in
volumename is the string version of the volume's numerical identifier.
\par Error Codes
VL_BADVOLTYPE An illegal volume type has been specified by the voltype
argument.
\n VL_NOENT This volume instance does not appear in the VLDB.
\n VL_ENTDELETED The given VLDB entry has already been marked as deleted.
\n VL_BADNAME The volume name is invalid.
1323
 \subsection sec3-6-5 Section 3.6.5: VL_GetNewVolumeId - Generate a new
volume ID

\code
int VL_GetNewVolumeId(IN struct rx_connection *z_conn,
                      IN long bumpcount,
                      OUT long *newvolumid)
\endcode
\par Description
Acquire bumpcount unused, consecutively-numbered volume identifiers from the
Volume Location Server. The lowest-numbered of the newly-acquired set is placed
in the newvolumid argument. The largest number of volume IDs that may be
generated with any one call is bounded by the MAXBUMPCOUNT constant defined in
Section 3.2.1. Currently, there is (effectively) no restriction on the number
of volume identifiers that may thus be reserved in a single call.
\par
The VLDB is write-locked for the duration of this operation.
\par Error Codes
VL_PERM The caller is not authorized to execute this function.
\n VL_BADVOLIDBUMP The value of the bumpcount parameter exceeds the system
limit of MAXBUMPCOUNT.
\n VL_IO An error occurred while writing to the VLDB.
1346
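\par
The allocation behavior described above might be modelled as a simple counter
bump. The sketch below is an assumption-laden illustration, not the server's
code: it treats the header counter as the next free ID, uses an assumed value
for the MAXBUMPCOUNT limit, also rejects negative counts, and performs no
locking, authorization, or I/O. All demo_ names are hypothetical.

```c
#include <assert.h>

#define DEMO_VL_BADVOLIDBUMP 363539L     // VL_BADVOLIDBUMP, per Section 3.4
#define DEMO_MAXBUMPCOUNT    0x7fffffffL // assumed limit; see Section 3.2.1

// Stand-in for the volume ID counter kept in the VLDB header.
static long demo_MaxVolumeId = 536870912L;

// Reserve bumpcount consecutive IDs; *newvolumid receives the lowest one.
static long demo_get_new_volume_id(long bumpcount, long *newvolumid)
{
    assert(newvolumid != 0);
    if (bumpcount < 0 || bumpcount > DEMO_MAXBUMPCOUNT)
        return DEMO_VL_BADVOLIDBUMP;
    *newvolumid = demo_MaxVolumeId;
    demo_MaxVolumeId += bumpcount;
    return 0;
}
```

A caller that asks for three IDs and receives N may use N, N+1, and N+2; the
next successful call returns N+3 or higher.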
 \subsection sec3-6-6 Section 3.6.6: VL_ReplaceEntry - Replace entire
contents of VLDB entry

\code
int VL_ReplaceEntry(IN struct rx_connection *z_conn,
                    IN long Volid,
                    IN long voltype,
                    IN vldbentry *newentry,
                    IN long ReleaseType)
\endcode
\par Description
Perform a wholesale replacement of the VLDB entry corresponding to the volume
instance whose identifier is Volid and type voltype with the information
contained in the newentry argument. Individual VLDB entry fields cannot be
selectively changed while the others are preserved; VL_UpdateEntry() should be
used for this objective. The permissible values for the ReleaseType parameter
are defined in Section 3.2.7.
\par
The VLDB is write-locked for the duration of this operation. All of the hash
tables impacted are brought up to date to incorporate the new information.
\par Error Codes
VL_PERM The caller is not authorized to execute this function.
\n VL_BADVOLTYPE An illegal volume type has been specified by the voltype
argument.
\n VL_BADRELLOCKTYPE An illegal release lock has been specified by the
ReleaseType argument.
\n VL_NOENT This volume instance does not appear in the VLDB.
\n VL_BADENTRY An attempt was made to change a read-write volume ID.
\n VL_IO An error occurred while writing to the VLDB.
1376
 \subsection sec3-6-7 Section 3.6.7: VL_UpdateEntry - Update contents of
VLDB entry

\code
int VL_UpdateEntry(IN struct rx_connection *z_conn,
                   IN long Volid,
                   IN long voltype,
                   IN VldbUpdateEntry *UpdateEntry,
                   IN long ReleaseType)
\endcode
\par Description
Update the VLDB entry corresponding to the volume instance whose identifier is
Volid and type voltype with the information contained in the UpdateEntry
argument. Most of the entry's fields can be modified in a single call to
VL_UpdateEntry(). The Mask field within the UpdateEntry parameter selects the
fields to update with the values stored within the other UpdateEntry fields.
Permissible values for the ReleaseType parameter are defined in Section 3.2.7.
\par
The VLDB is write-locked for the duration of this operation.
\par Error Codes
VL_PERM The caller is not authorized to execute this function.
\n VL_BADVOLTYPE An illegal volume type has been specified by the voltype
argument.
\n VL_BADRELLOCKTYPE An illegal release lock has been specified by the
ReleaseType argument.
\n VL_NOENT This volume instance does not appear in the VLDB.
\n VL_IO An error occurred while writing to the VLDB.
1404
 \subsection sec3-6-8 Section 3.6.8: VL_SetLock - Lock VLDB entry

\code
int VL_SetLock(IN struct rx_connection *z_conn,
               IN long Volid,
               IN long voltype,
               IN long voloper)
\endcode
\par Description
Lock the VLDB entry matching the given volume ID (Volid) and type (voltype) for
volume operation voloper (e.g., VLOP_MOVE or VLOP_RELEASE). If the entry is
currently unlocked, then its LockTimestamp will be zero. If the lock is
obtained, the given voloper is stamped into the flags field, and the
LockTimestamp is set to the time of the call.
\note
When the caller attempts to lock the entry for a release operation, special
care is taken to abort the operation if the entry has already been locked for
this operation, and the existing lock has timed out. In this case, VL_SetLock()
returns VL_RERELEASE.
\par
The VLDB is write-locked for the duration of this operation.
\par Error Codes
VL_PERM The caller is not authorized to execute this function.
\n VL_BADVOLTYPE An illegal volume type has been specified by the voltype
argument.
\n VL_BADVOLOPER An illegal volume operation was specified in the voloper
argument. Legal values are defined in the latter part of the table in Section
3.2.5.
\n VL_ENTDELETED The given VLDB entry has already been marked as deleted.
\n VL_ENTRYLOCKED The given VLDB entry has already been locked (which has not
yet timed out).
\n VL_RERELEASE A VLDB entry locked for release has timed out, and the caller
also wanted to perform a release operation on it.
\n VL_IO An error was experienced while attempting to write to the VLDB.
1439
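\par
The locking decision described above can be approximated by the sketch below.
It is a simplified model under stated assumptions: the operation bit values and
the lock lifetime are invented for the example, the flags field is overwritten
rather than merged, and a timed-out lock held for any other operation is
assumed to be taken over. The two error values come from Section 3.4; all
demo_ names are hypothetical.

```c
#include <assert.h>

#define DEMO_VL_ENTRYLOCKED 363541L  // VL_ENTRYLOCKED, per Section 3.4
#define DEMO_VL_RERELEASE   363544L  // VL_RERELEASE, per Section 3.4
#define DEMO_OP_MOVE        0x10L    // hypothetical operation bits
#define DEMO_OP_RELEASE     0x20L
#define DEMO_LOCK_TIMEOUT   7200L    // assumed lock lifetime, in seconds

struct demo_lock_state {
    long flags;          // records the operation the entry is locked for
    long LockTimestamp;  // 0 means unlocked
};

// Try to lock e for voloper at time now; return 0 or a VL error value.
static long demo_set_lock(struct demo_lock_state *e, long voloper, long now)
{
    assert(e != 0 && now > 0);
    if (e->LockTimestamp != 0) {
        if (now - e->LockTimestamp <= DEMO_LOCK_TIMEOUT)
            return DEMO_VL_ENTRYLOCKED;      // a live lock is still held
        if (voloper == DEMO_OP_RELEASE && (e->flags & DEMO_OP_RELEASE))
            return DEMO_VL_RERELEASE;        // stale lock from a prior release
    }
    e->flags = voloper;
    e->LockTimestamp = now;
    return 0;
}
```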
 \subsection sec3-6-9 Section 3.6.9: VL_ReleaseLock - Unlock VLDB entry

\code
int VL_ReleaseLock(IN struct rx_connection *z_conn,
                   IN long Volid,
                   IN long voltype,
                   IN long ReleaseType)
\endcode
\par Description
Unlock the VLDB entry matching the given volume ID (Volid) and type (voltype).
The ReleaseType argument determines which VLDB entry fields from flags and
LockAfsId will be cleared along with the lock timestamp in LockTimestamp.
Permissible values for the ReleaseType parameter are defined in Section 3.2.7.
\par
The VLDB is write-locked for the duration of this operation.
\par Error Codes
VL_PERM The caller is not authorized to execute this function.
\n VL_BADVOLTYPE An illegal volume type has been specified by the voltype
argument.
\n VL_BADRELLOCKTYPE An illegal release lock has been specified by the
ReleaseType argument.
\n VL_NOENT This volume instance does not appear in the VLDB.
\n VL_ENTDELETED The given VLDB entry has already been marked as deleted.
\n VL_IO An error was experienced while attempting to write to the VLDB.
1464
 \subsection sec3-6-10 Section 3.6.10: VL_ListEntry - Get contents of
VLDB via index

\code
int VL_ListEntry(IN struct rx_connection *z_conn,
                 IN long previous_index,
                 OUT long *count,
                 OUT long *next_index,
                 OUT vldbentry *entry)
\endcode
\par Description
This function assists in the task of enumerating the contents of the VLDB.
Given an index into the database, previous_index, this call returns the single
VLDB entry at that offset, placing it in the entry argument. The number of VLDB
entries left to list is placed in count, and the index of the next entry to
request is returned in next_index. If an illegal index is provided, count is
set to -1.
\par
The VLDB is read-locked for the duration of this operation.
\par Error Codes
---None.
1486
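\par
An enumeration loop built on this calling pattern might look like the sketch
below. No RPC is made: demo_list_entry() is a local stand-in operating on a
three-element array, and it assumes that the first index is 0, that indices
are array positions, and that count reports the entries remaining after the
one returned. None of these details are confirmed by this section.

```c
#include <assert.h>

#define DEMO_NENTRIES 3
static const long demo_db[DEMO_NENTRIES] = {
    536870912L, 536870915L, 536870918L
};

// Local stand-in for the RPC. Copies the volume ID at previous_index into
// *volid, sets *next_index to the index to request next, and sets *count to
// the number of entries remaining (or -1 for an illegal index).
static int demo_list_entry(long previous_index, long *count,
                           long *next_index, long *volid)
{
    assert(count != 0 && next_index != 0 && volid != 0);
    if (previous_index < 0 || previous_index >= DEMO_NENTRIES) {
        *count = -1;
        return 0;
    }
    *volid = demo_db[previous_index];
    *next_index = previous_index + 1;
    *count = DEMO_NENTRIES - *next_index;
    return 0;
}

// Walk the whole table; return how many entries were visited.
static int demo_enumerate(long *last_seen)
{
    long index = 0, count = 0, next = 0, volid = 0;
    int visited = 0;
    do {
        if (demo_list_entry(index, &count, &next, &volid) != 0 || count < 0)
            break;
        *last_seen = volid;
        visited++;
        index = next;
    } while (count > 0);
    return visited;
}
```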
 \subsection sec3-6-11 Section 3.6.11: VL_ListAttributes - List all VLDB
entries matching given attributes, single return object

\code
int VL_ListAttributes(IN struct rx_connection *z_conn,
                      IN VldbListByAttributes *attributes,
                      OUT long *nentries,
                      OUT bulkentries *blkentries)
\endcode
\par Description
Retrieve all the VLDB entries that match the attributes listed in the
attributes parameter, placing them in the blkentries object. The number of
matching entries is placed in nentries. Matching can be done by server number,
partition, volume type, flag, or volume ID. The legal values to use in the
attributes argument are listed in Section 3.2.3. Note that if the
VLLIST_VOLUMEID bit is set in attributes, all other bit values are ignored and
the volume ID provided is the sole search criterion.
\par
The VLDB is read-locked for the duration of this operation.
\par
Note that VL_ListAttributes() is a potentially expensive function, as a
sequential search through all of the VLDB entries is performed in most cases.
\par Error Codes
VL_NOMEM Memory for the blkentries object could not be allocated.
\n VL_NOENT This specified volume instance does not appear in the VLDB.
\n VL_SIZEEXCEEDED Ran out of room in the blkentries object.
\n VL_IO Error while reading from the VLDB.
1514
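\par
The matching rule, including the way a volume ID selector overrides all other
criteria, can be sketched as a predicate. The selector bit values and the
simplified structures are hypothetical (Section 3.2.3 lists the real VLLIST_
values), the flag selector is omitted, and combining the remaining selectors
with a logical AND is an assumption of this example.

```c
#include <assert.h>

// Hypothetical selector bits; Section 3.2.3 lists the real VLLIST_ values.
#define DEMO_LIST_SERVER     0x01UL
#define DEMO_LIST_PARTITION  0x02UL
#define DEMO_LIST_VOLUMETYPE 0x04UL
#define DEMO_LIST_VOLUMEID   0x08UL

struct demo_attrs {          // loosely mirrors struct VldbListByAttributes
    unsigned long Mask;
    long server, partition, volumetype, volumeid;
};

struct demo_site {           // one volume instance, heavily simplified
    long server, partition, volumetype, volumeid;
};

// Return 1 if site s satisfies the attributes in a, 0 otherwise.
static int demo_matches(const struct demo_attrs *a, const struct demo_site *s)
{
    assert(a != 0 && s != 0);
    // A volume ID selector overrides every other criterion.
    if (a->Mask & DEMO_LIST_VOLUMEID)
        return s->volumeid == a->volumeid;
    if ((a->Mask & DEMO_LIST_SERVER) && s->server != a->server)
        return 0;
    if ((a->Mask & DEMO_LIST_PARTITION) && s->partition != a->partition)
        return 0;
    if ((a->Mask & DEMO_LIST_VOLUMETYPE) && s->volumetype != a->volumetype)
        return 0;
    return 1;
}
```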
 \subsection sec3-6-12 Section 3.6.12: VL_LinkedList - List all VLDB
entries matching given attributes, linked list return object

\code
int VL_LinkedList(IN struct rx_connection *z_conn,
                  IN VldbListByAttributes *attributes,
                  OUT long *nentries,
                  OUT vldb_list *linkedentries)
\endcode
\par Description
Retrieve all the VLDB entries that match the attributes listed in the
attributes parameter, creating a linked list of entries based in the
linkedentries object. The number of matching entries is placed in nentries.
Matching can be done by server number, partition, volume type, flag, or volume
ID. The legal values to use in the attributes argument are listed in Section
3.2.3. Note that if the VLLIST_VOLUMEID bit is set in attributes, all other bit
values are ignored and the volume ID provided is the sole search criterion.
\par
The VL_LinkedList() function is identical to VL_ListAttributes(), except
for the method of delivering the VLDB entries to the caller.
\par
The VLDB is read-locked for the duration of this operation.
\par Error Codes
VL_NOMEM Memory for an entry in the list based at linkedentries object could
not be allocated.
\n VL_NOENT This specified volume instance does not appear in the VLDB.
\n VL_SIZEEXCEEDED Ran out of room in the current list object.
\n VL_IO Error while reading from the VLDB.
1543
 \subsection sec3-6-13 Section 3.6.13: VL_GetStats - Get Volume Location
Server statistics

\code
int VL_GetStats(IN struct rx_connection *z_conn,
                OUT vldstats *stats,
                OUT vital_vlheader *vital_header)
\endcode
\par Description
Collect the different types of VLDB statistics. Part of the VLDB header is
returned in vital_header, which includes such information as the number of
allocations and frees performed, and the next volume ID to be allocated. The
dynamic per-operation stats are returned in the stats argument, reporting the
number and types of operations and aborts.
\par
The VLDB is read-locked for the duration of this operation.
\par Error Codes
VL_PERM The caller is not authorized to execute this function.
1562
 \subsection sec3-6-14 Section 3.6.14: VL_Probe - Verify Volume Location
Server connectivity/status

\code
int VL_Probe(IN struct rx_connection *z_conn)
\endcode
1569\par Description
1570This routine serves a 'pinging' function to determine whether the Volume
1571Location Server is still running. If this call succeeds, then the Volume
1572Location Server is shown to be capable of responding to RPCs, thus confirming
1573connectivity and basic operation.
1574\par
1575The VLDB is not locked for this operation.
1576\par Error Codes
1577---None.
1578
1579 \section sec3-7 Section 3.7: Kernel Interface Subset
1580
1581\par
1582The interface described by this document so far applies to user-level clients,
1583such as the vos utility. However, some volume location operations must be
1584performed from within the kernel. Specifically, the Cache Manager must find out
1585where volumes reside and otherwise gather information about them in order to
1586conduct its business with the File Servers holding them. In order to support
1587Volume Location Server interconnection for agents operating within the kernel,
1588the afsvlint.xg Rxgen interface was built. It is a minimal subset of the
1589user-level vldbint.xg definition. Within afsvlint.xg, there are duplicate
definitions for such constants as MAXNAMELEN, MAXNSERVERS, MAXTYPES,
VLF_RWEXISTS, VLF_ROEXISTS, VLF_BACKEXISTS, VLSF_NEWREPSITE, VLSF_ROVOL,
VLSF_RWVOL, and VLSF_BACKVOL. Since the only operations the Cache Manager must
perform are locating a volume given its ID or name and detecting
unresponsive Volume Location Servers, the following interface routines
are duplicated in afsvlint.xg, along with the struct vldbentry declaration:
\li VL_GetEntryByID()
\li VL_GetEntryByName()
\li VL_Probe()
1599
1600 \page chap4 Chapter 4: Volume Server Architecture
1601
1602 \section sec4-1 Section 4.1: Introduction
1603
1604\par
1605The Volume Server allows administrative tasks and probes to be performed on the
1606set of AFS volumes residing on the machine on which it is running. As described
1607in Chapter 2, a distributed database holding volume location info, the VLDB, is
1608used by client applications to locate these volumes. Volume Server functions
1609are typically invoked either directly from authorized users via the vos utility
1610or by the AFS backup system.
1611\par
1612This chapter briefly discusses various aspects of the Volume Server's
1613architecture. First, the high-level on-disk representation of volumes is
covered. Then, the transactions used in conjunction with volume operations are
examined. Next, the program implementing the Volume Server, volserver, is
1616considered. The nature and format of the log file kept by the Volume Server
1617rounds out the description.
1618As with all AFS servers, the Volume Server uses the Rx remote procedure call
1619package for communication with its clients.
1620
1621 \section sec4-2 Section 4.2: Disk Representation
1622
1623\par
1624For each volume on an AFS partition, there exists a file visible in the unix
1625name space which describes the contents of that volume. By convention, each of
1626these files is named by concatenating a prefix string, "V", the numerical
1627volume ID, and the postfix string ".vol". Thus, file V0536870918.vol describes
1628the volume whose numerical ID is 0536870918. Internally, each per-volume
1629descriptor file has such fields as a version number, the numerical volume ID,
1630and the numerical parent ID (useful for read-only or backup volumes). It also
1631has a list of related inodes, namely files which are not visible from the unix
1632name space (i.e., they do not appear as entries in any unix directory object).
The important related inodes are:
\li Volume info inode: This field identifies the inode which hosts the on-disk
representation of the volume's header. It is very similar to the information
pointed to by the volume field of the struct volser_trans defined in Section
5.4.1, recording important status information for the volume.
1638\li Large vnode index inode: This field identifies the inode which holds the
1639list of vnode identifiers for all directory objects residing within the volume.
1640These are "large" since they must also hold the Access Control List (ACL)
1641information for the given AFS directory.
1642\li Small vnode index inode: This field identifies the inode which holds the
1643list of vnode identifiers for all non-directory objects hosted by the volume.
1644\par
1645All of the actual files and directories residing within an AFS volume, as
1646identified by the contents of the large and small vnode index inodes, are also
1647free-floating inodes, not appearing in the conventional unix name space. This
1648is the reason the vendor-supplied fsck program should not be run on partitions
1649containing AFS volumes. Since the inodes making up AFS files and directories,
1650as well as the inodes serving as volume indices for them, are not mapped to any
1651directory, the standard fsck program would throw away all of these
1652"unreferenced" inodes. Thus, a special version of fsck is provided that
1653recognizes partitions containing AFS volumes as well as standard unix
1654partitions.
1655
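\par
The naming convention for the per-volume descriptor file can be illustrated
with a short helper. The 10-digit zero padding is inferred from the
V0536870918.vol example and is an assumption of this sketch, as is the
demo_volume_header_name() function itself.

```c
#include <assert.h>
#include <stdio.h>

// Build "V<id>.vol" into buf, zero-padding the ID to 10 digits (assumed).
// Returns 0 on success, -1 if buf is too small to hold the name.
static int demo_volume_header_name(unsigned long volid, char *buf, size_t len)
{
    int n;
    assert(buf != NULL && len > 0);
    n = snprintf(buf, len, "V%010lu.vol", volid);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

For volume ID 536870918 this produces V0536870918.vol, matching the example
given in the text.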
1656 \section sec4-3 Section 4.3: Transactions
1657
1658\par
1659Each individual volume operation is carried out by the Volume Server as a
1660transaction, but not in the atomic sense of the word. Logically, creating a
1661Volume Server transaction can be equated with performing an "exclusive open" on
1662the given volume before beginning the actual work of the desired volume
1663operation. No other Volume Server (or File Server) operation is allowed on the
1664opened volume until the transaction is terminated. Thus, transactions in the
1665context of the Volume Server serve to provide mutual exclusion without any of
1666the normal atomicity guarantees. Volumes maintain enough internal state to
1667enable recovery from interrupted or failed operations via use of the salvager
1668program. Whenever volume inconsistencies are detected, this salvager program is
1669run, which then attempts to correct the problem.
\par
Volume transactions have timeouts associated with them. This guarantees that
the death of the agent performing a given volume operation cannot result in the
volume being permanently removed from circulation. There are actually two
timeout periods defined for a volume transaction. The first is the warning
time, defined to be 5 minutes. If a transaction lasts for more than this time
period without making progress, the Volume Server prints a warning message to
its log file (see Section 4.5). The second time value associated with a volume
transaction is the hard timeout, defined to occur 10 minutes after any progress
has been made on the given operation. After this period, the transaction will
be unconditionally deleted, and the volume freed for any other operations.
Transactions are reference-counted. Progress will be deemed to have occurred
for a transaction, and its internal timeclock field will be updated, when:
\li 1 The transaction is first created.
\li 2 A reference is made to the transaction, causing the Volume Server to look
it up in its internal tables.
\li 3 The transaction's reference count is decremented.

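\par
The timeout rules above can be sketched in C as follows. This is a minimal
illustration and not the Volume Server's own code: the structure, constant, and
function names are all hypothetical, and only the 5-minute and 10-minute
periods and the three progress events are taken from the text. Starting a new
transaction with a reference count of zero is likewise an assumption.

```c
#include <time.h>

// Hypothetical names; only the two durations come from the text.
#define TRANS_WARN_SECS (5 * 60)   // warning time
#define TRANS_HARD_SECS (10 * 60)  // hard timeout

// Hypothetical stand-in for the bookkeeping part of a transaction record.
struct trans_sketch {
    int refCount;         // outstanding references to the transaction
    time_t lastProgress;  // "timeclock": when progress was last noted
};

enum trans_verdict { TRANS_OK, TRANS_WARN, TRANS_EXPIRED };

// Event 1: the transaction is first created.
void trans_create(struct trans_sketch *t, time_t now)
{
    t->refCount = 0;  // assumption: no references held at creation
    t->lastProgress = now;
}

// Event 2: the transaction is looked up in the server's tables.
void trans_lookup(struct trans_sketch *t, time_t now)
{
    t->refCount++;
    t->lastProgress = now;
}

// Event 3: the reference count is decremented.
void trans_release(struct trans_sketch *t, time_t now)
{
    t->refCount--;
    t->lastProgress = now;
}

// Classify a transaction by the time elapsed since its last progress.
enum trans_verdict trans_check(const struct trans_sketch *t, time_t now)
{
    double idle = difftime(now, t->lastProgress);

    if (idle > TRANS_HARD_SECS)
        return TRANS_EXPIRED;  // delete unconditionally, free the volume
    if (idle > TRANS_WARN_SECS)
        return TRANS_WARN;     // write a warning to the log file
    return TRANS_OK;
}
```

\par
A periodic scan could call trans_check() on every live transaction, logging a
warning for TRANS_WARN and deleting the transaction for TRANS_EXPIRED.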
 \section sec4-4 Section 4.4: The volserver Process

\par
The volserver user-level program is run on every AFS server machine, and
implements the Volume Server agent. It is responsible for providing the Volume
Server interface as defined by the volint.xg Rxgen file.
\par
The volserver process defines and launches five threads to perform the bulk of
its duties. One thread implements a background daemon whose job is to
garbage-collect timed-out transaction structures. The other four threads are
RPC interface listeners, primed to accept remote procedure calls and thus
perform the defined set of volume operations.
\par
Certain non-standard configuration settings are made for the RPC subsystem by
the volserver program. For example, it chooses to extend the length of time
that an Rx connection may remain idle from the default 12 seconds to 120
seconds. The reasoning here is that certain volume operations may take longer
than 12 seconds of processing time on the server, and thus the default setting
for the connection timeout value would incorrectly terminate an RPC when in
fact it was proceeding normally and correctly.
\par
The volserver program takes a single, optional command line argument. If a
positive integer value is provided on the command line, it is used to set the
debugging level within the Volume Server. By default, a value of zero is used,
specifying that no special debugging output will be generated and fed to the
Volume Server log file described below.

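\par
The handling of the optional debugging-level argument might look like the
following sketch. The function name is hypothetical, and treating a missing,
malformed, or non-positive argument as level zero is an assumption; the text
only states that a positive integer sets the level and that zero is the
default.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

// Hypothetical helper: derive the debugging level from the command line.
// Returns 0 (no special debugging output) unless argv[1] is a positive
// integer that fits in an int.
int debug_level_from_args(int argc, char **argv)
{
    char *end;
    long value;

    if (argc < 2)
        return 0;  // argument omitted: default level

    errno = 0;
    value = strtol(argv[1], &end, 10);
    if (errno != 0 || end == argv[1] || *end != '\0')
        return 0;  // assumption: malformed input falls back to 0
    if (value <= 0 || value > INT_MAX)
        return 0;  // assumption: only positive levels are honored
    return (int)value;
}
```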
 \section sec4-5 Section 4.5: Log File

\par
The Volume Server keeps a log file, recording the set of events of special
interest it has encountered. The file is named VolserLog, and is stored in the
/usr/afs/logs directory on the local disk of the server machine on which the
Volume Server runs. This is a human-readable file, with every entry
time-stamped.
\par
Whenever the volserver program restarts, it renames the current VolserLog file
to VolserLog.old, and starts up a fresh log. A properly-authorized individual
can easily inspect the log file residing on any given server machine. This is
made possible by the BOS Server AFS agent running on the machine, which allows
the contents of this file to be fetched and displayed on the caller's machine
via the bos getlog command.
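\par
The rename-on-restart behavior can be illustrated with standard C library
calls. This is a sketch under stated assumptions rather than the volserver
implementation: the function name is hypothetical, and it relies on the POSIX
behavior of rename(), which replaces an existing destination file.

```c
#include <errno.h>
#include <stdio.h>

// Hypothetical helper: move the log at `path` to `path`.old, then open a
// fresh, empty log at `path`. Returns NULL on failure.
FILE *open_fresh_log(const char *path)
{
    char oldPath[1024];
    int n = snprintf(oldPath, sizeof(oldPath), "%s.old", path);

    if (n < 0 || (size_t)n >= sizeof(oldPath))
        return NULL;  // path too long for the buffer

    // A missing current log (first start) is not treated as an error.
    if (rename(path, oldPath) != 0 && errno != ENOENT)
        return NULL;

    return fopen(path, "w");
}
```

\par
Under the conventions described above, the argument would be
/usr/afs/logs/VolserLog.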
\par
An excerpt from a Volume Server log file follows. The numbers appearing in
square brackets at the beginning of each line have been inserted so that we
may reference the individual lines of the log excerpt in the following
paragraphs.
\code
[1] Wed May 8 06:03:00 1991 AttachVolume: Error attaching volume /vicepd/V1969547815.vol; volume needs salvage
[2] Wed May 8 06:03:01 1991 Volser: ListVolumes: Could not attach volume 1969547815
[3] Wed May 8 07:36:13 1991 Volser: Clone: Cloning volume 1969541499 to new volume 1969541501
[4] Wed May 8 11:25:05 1991 AttachVolume: Cannot read volume header /vicepd/V1969547415.vol
[5] Wed May 8 11:25:06 1991 Volser: CreateVolume: volume 1969547415 (bld.dce.s3.dv.pmax_ul3) created
\endcode
\par
Line [1] indicates that the volume whose numerical ID is 1969547815 could not
be attached on partition /vicepd. This error is probably the result of an
aborted transaction which left the volume in an inconsistent state, or of
actual damage to the volume structure or data. In this case, the Volume Server
recommends that the salvager program be run on this volume to restore its
integrity. Line [2] records the operation which revealed this situation, namely
the invocation of an AFSVolListVolumes() RPC.
\par
Line [4] reveals that the volume header file for a specific volume could not be
read. Line [5], as with line [2] in the above paragraph, indicates why this
occurred. Someone had called the AFSVolCreateVolume() interface function, and
as a precaution, the Volume Server first checked to see if such a volume was
already present by attempting to read its header.
\par
Having thus verified that the volume did not previously exist, the Volume
Server allowed the AFSVolCreateVolume() call to continue its processing,
creating and initializing the proper volume file, V1969547415.vol, and the
associated header and index inodes.

 \page chap5 Chapter 5: Volume Server Interface

 \section sec5-1 Section 5.1: Introduction

\par
This chapter documents the API for the Volume Server facility, as defined by
the volint.xg Rxgen interface file and the volser.h include file. Descriptions
of all the constants, structures, macros, and interface functions available to
the application programmer appear here.

 \section sec5-2 Section 5.2: Constants

\par
This section covers the basic constant definitions of interest to the Volume
Server application programmer. These definitions appear in the volint.h file,
automatically generated from the volint.xg Rxgen interface file, and in
volser.h.
\par
Each subsection is devoted to describing the constants falling into one of the
following categories:
\li Configuration and boundary values
\li Interface routine opcodes
\li Transaction Flags
\li Volume Types
\li LWP State
\li States for struct vldbentry
\li Validity Checks
\li Miscellaneous

 \subsection sec5-2-1 Section 5.2.1: Configuration and Boundary Values

\par
These constants define some basic system configuration values, along with such
things as maximum sizes of important arrays.

\par Name
MyPort
\par Value
5,003
\par Description
The Rx UDP port on which the Volume Server service may be found.

\par Name
NameLen
\par Value
80
\par Description
Used by the vos utility to define maximum lengths for internal filename
variables.

\par Name
VLDB_MAXSERVERS
\par Value
10
\par Description
Maximum number of server agents implementing the AFS Volume Location Database
(VLDB) for the cell.

\par Name
VOLSERVICE_ID
\par Value
4
\par Description
The Rx service number on the given UDP port (MyPort) above.

\par Name
INVALID_BID
\par Value
0
\par Description
Used as an invalid read-only or backup volume ID.

\par Name
VOLSER_MAXVOLNAME
\par Value
65
\par Description
The number of characters in the longest possible volume name, including the
trailing null. Note: this is only used by the vos utility; the Volume Server
uses the "old" value below.

\par Name
VOLSER_OLDMAXVOLNAME
\par Value
32
\par Description
The "old" maximum number of characters in an AFS volume name, including the
trailing null. In reality, it is also the current maximum.

\par Name
VOLSER_MAX_REPSITES
\par Value
7
\par Description
The maximum number of replication sites for a volume.

\par Name
VNAMESIZE
\par Value
32
\par Description
Size in bytes of the name field in struct volintInfo (see Section 5.4.6).


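\par
For reference, the values tabulated in this subsection can be written as C
preprocessor definitions, as in the sketch below. The identifiers are spelled
with underscores, as C requires. The definitions shipped in volser.h and
volint.h for a given release are authoritative; the helper function is
hypothetical and only illustrates that a limit which includes the trailing null
leaves one character fewer for the name itself.

```c
#include <string.h>

// Values as listed in Section 5.2.1.
#define MyPort               5003  // Rx UDP port of the Volume Server
#define NameLen              80
#define VLDB_MAXSERVERS      10
#define VOLSERVICE_ID        4
#define INVALID_BID          0
#define VOLSER_MAXVOLNAME    65    // vos utility only, includes the null
#define VOLSER_OLDMAXVOLNAME 32    // Volume Server limit, includes the null
#define VOLSER_MAX_REPSITES  7
#define VNAMESIZE            32

// Hypothetical helper: returns 1 if `name` plus its trailing null fits
// within the Volume Server's volume name limit, 0 otherwise.
int volname_fits_server(const char *name)
{
    return strlen(name) < VOLSER_OLDMAXVOLNAME;
}
```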
 \subsection sec5-2-2 Section 5.2.2: Interface Routine Opcodes

\par
These constants, appearing in the volint.xg Rxgen interface file for the Volume
Server, define the opcodes for the RPC routines. Every Rx call on this
interface contains this opcode, and the dispatcher uses it to select the proper
code at the server site to carry out the call.

\par Name
VOLCREATEVOLUME
\par Value
100
\par Description
Opcode for AFSVolCreateVolume()

\par Name
VOLDELETEVOLUME
\par Value
101
\par Description
Opcode for AFSVolDeleteVolume()

\par Name
VOLRESTORE
\par Value
102
\par Description
Opcode for AFSVolRestoreVolume()

\par Name
VOLFORWARD
\par Value
103
\par Description
Opcode for AFSVolForward()

\par Name
VOLENDTRANS
\par Value
104
\par Description
Opcode for AFSVolEndTrans()

\par Name
VOLCLONE
\par Value
105
\par Description
Opcode for AFSVolClone()

\par Name
VOLSETFLAGS
\par Value
106
\par Description
Opcode for AFSVolSetFlags()

\par Name
VOLGETFLAGS
\par Value
107
\par Description
Opcode for AFSVolGetFlags()

\par Name
VOLTRANSCREATE
\par Value
108
\par Description
Opcode for AFSVolTransCreate()

\par Name
VOLDUMP
\par Value
109
\par Description
Opcode for AFSVolDump()

\par Name
VOLGETNTHVOLUME
\par Value
110
\par Description
Opcode for AFSVolGetNthVolume()

\par Name
VOLSETFORWARDING
\par Value
111
\par Description
Opcode for AFSVolSetForwarding()

\par Name
VOLGETNAME
\par Value
112
\par Description
Opcode for AFSVolGetName()

\par Name
VOLGETSTATUS
\par Value
113
\par Description
Opcode for AFSVolGetStatus()

\par Name
VOLSIGRESTORE
\par Value
114
\par Description
Opcode for AFSVolSignalRestore()

\par Name
VOLLISTPARTITIONS
\par Value
115
\par Description
Opcode for AFSVolListPartitions()

\par Name
VOLLISTVOLS
\par Value
116
\par Description
Opcode for AFSVolListVolumes()

\par Name
VOLSETIDSTYPES
\par Value
117
\par Description
Opcode for AFSVolSetIdsTypes()

\par Name
VOLMONITOR
\par Value
118
\par Description
Opcode for AFSVolMonitor()

\par Name
VOLDISKPART
\par Value
119
\par Description
Opcode for AFSVolPartitionInfo()

\par Name
VOLRECLONE
\par Value
120
\par Description
Opcode for AFSVolReClone()

\par Name
VOLLISTONEVOL
\par Value
121
\par Description
Opcode for AFSVolListOneVolume()

\par Name
VOLNUKE
\par Value
122
\par Description
Opcode for AFSVolNukeVolume()

\par Name
VOLSETDATE
\par Value
123
\par Description
Opcode for AFSVolSetDate()

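\par
The opcode-to-routine correspondence listed in this subsection can be captured
in a small lookup table, sketched below. This is not the dispatcher that Rxgen
generates from volint.xg; the structure and function names are hypothetical,
and the table maps each opcode only to the routine name given above, which is
the kind of mapping a tracing or diagnostic tool might want.

```c
#include <stddef.h>

// Hypothetical table entry pairing an opcode with its routine name.
struct opcode_entry {
    int opcode;
    const char *routine;
};

// Opcodes and routine names exactly as listed in Section 5.2.2.
static const struct opcode_entry opcode_table[] = {
    {100, "AFSVolCreateVolume"},  {101, "AFSVolDeleteVolume"},
    {102, "AFSVolRestoreVolume"}, {103, "AFSVolForward"},
    {104, "AFSVolEndTrans"},      {105, "AFSVolClone"},
    {106, "AFSVolSetFlags"},      {107, "AFSVolGetFlags"},
    {108, "AFSVolTransCreate"},   {109, "AFSVolDump"},
    {110, "AFSVolGetNthVolume"},  {111, "AFSVolSetForwarding"},
    {112, "AFSVolGetName"},       {113, "AFSVolGetStatus"},
    {114, "AFSVolSignalRestore"}, {115, "AFSVolListPartitions"},
    {116, "AFSVolListVolumes"},   {117, "AFSVolSetIdsTypes"},
    {118, "AFSVolMonitor"},       {119, "AFSVolPartitionInfo"},
    {120, "AFSVolReClone"},       {121, "AFSVolListOneVolume"},
    {122, "AFSVolNukeVolume"},    {123, "AFSVolSetDate"},
};

// Returns the routine name for `opcode`, or NULL if it is not listed.
const char *routine_for_opcode(int opcode)
{
    size_t i;
    size_t count = sizeof(opcode_table) / sizeof(opcode_table[0]);

    for (i = 0; i < count; i++) {
        if (opcode_table[i].opcode == opcode)
            return opcode_table[i].routine;
    }
    return NULL;
}
```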
 \subsection sec5-2-3 Section 5.2.3: Transaction Flags

\par
These constants define the various flags the Volume Server uses in association
with volume transactions, keeping track of volumes upon which operations are
currently proceeding. There are three sets of flag values, stored in three
different fields within a struct volser_trans: general volume state, attachment
modes, and specific transaction states.

 \subsubsection sec5-2-3-1 Section 5.2.3.1: vflags

\par
These values are used to represent the general state of the associated volume.
They appear in the vflags field within a struct volser_trans.

\par Name
VTDeleteOnSalvage
\par Value
1
\par Description
The volume should be deleted on next salvage.

\par Name
VTOutOfService
\par Value
2
\par Description
This volume should never be put online.

\par Name
VTDeleted
\par Value
4
\par Description
2076