1/*!
2 \addtogroup arch-overview Architectural Overview
3 \page title AFS-3 Programmer's Reference: Architectural Overview
4
5\author Edward R. Zayas
6Transarc Corporation
7\version 1.0
8\date 2 September 1991 22:53
9Copyright 1991 Transarc Corporation All Rights Reserved FS-00-D160
10
11
12 \page chap1 Chapter 1: Introduction
13
14 \section sec1-1 Section 1.1: Goals and Background
15
16\par
17This paper provides an architectural overview of Transarc's wide-area
18distributed file system, AFS. Specifically, it covers the current level of
19available software, the third-generation AFS-3 system. This document will
20explore the technological climate in which AFS was developed, the nature of
21problem(s) it addresses, and how its design attacks these problems in order to
22realize the inherent benefits in such a file system. It also examines a set of
23additional features for AFS, some of which are actively being considered.
24\par
25This document is a member of a reference suite providing programming
26specifications as to the operation of and interfaces offered by the various AFS
27system components. It is intended to serve as a high-level treatment of
28distributed file systems in general and of AFS in particular. This document
29should ideally be read before any of the others in the suite, as it provides
30the organizational and philosophical framework in which they may best be
31interpreted.
32
33 \section sec1-2 Section 1.2: Document Layout
34
35\par
36Chapter 2 provides a discussion of the technological background and
37developments that created the environment in which AFS and related systems were
38inspired. Chapter 3 examines the specific set of goals that AFS was designed to
39meet, given the possibilities created by personal computing and advances in
40communication technology. Chapter 4 presents the core AFS architecture and how
41it addresses these goals. Finally, Chapter 5 considers how AFS functionality
42may be improved by certain design changes.
43
44 \section sec1-3 Section 1.3: Related Documents
45
46\par
47The names of the other documents in the collection, along with brief summaries
48of their contents, are listed below.
49\li AFS-3 Programmer's Reference: File Server/Cache Manager Interface: This
50document describes the File Server and Cache Manager agents, which provide the
51backbone file management services for AFS. The collection of File Servers for a
52cell supplies centralized file storage for that site, and allows clients running
53the Cache Manager component to access those files in a high-performance, secure
54fashion.
55\li AFS-3 Programmer's Reference: Volume Server/Volume Location Server
56Interface: This document describes the services through which "containers" of
57related user data are located and managed.
58\li AFS-3 Programmer's Reference: Protection Server Interface: This paper
59describes the server responsible for mapping printable user names to and from
60their internal AFS identifiers. The Protection Server also allows users to
61create, destroy, and manipulate "groups" of users, which are suitable for
62placement on Access Control Lists (ACLs).
63\li AFS-3 Programmer's Reference: BOS Server Interface: This paper covers the
64"nanny" service which assists in the administrability of the AFS environment.
65\li AFS-3 Programmer's Reference: Specification for the Rx Remote Procedure Call
66Facility: This document specifies the design and operation of the remote
67procedure call and lightweight process packages used by AFS.
68
69 \page chap2 Chapter 2: Technological Background
70
71\par
72Certain changes in technology over the past two decades greatly influenced the
73nature of computational resources, and the manner in which they were used.
74These developments created the conditions under which the notion of a
75distributed file system (DFS) was born. This chapter describes these
76technological changes, and explores how a distributed file system attempts to
77capitalize on the new computing environment's strengths and minimize its
78disadvantages.
79
80 \section sec2-1 Section 2.1: Shift in Computational Idioms
81
82\par
83By the beginning of the 1980s, new classes of computing engines and new methods
84by which they could be interconnected were becoming firmly established. At this
85time, a shift was occurring away from the conventional mainframe-based,
86timeshared computing environment to one in which both workstation-class
87machines and the smaller personal computers (PCs) were a strong presence.
88\par
89The new environment offered many benefits to its users when compared with
90timesharing. These smaller, self-sufficient machines moved dedicated computing
91power and cycles directly onto people's desks. Personal machines were powerful
92enough to support a wide variety of applications, and allowed for a richer,
93more intuitive, more graphically-based interface for them. Learning curves were
94greatly reduced, cutting training costs and increasing new-employee
95productivity. In addition, these machines provided a constant level of service
96throughout the day. Since a personal machine was typically only executing
97programs for a single human user, it did not suffer from timesharing's
98load-based response time degradation. Expanding the computing services for an
99organization was often accomplished by simply purchasing more of the relatively
100cheap machines. Even small organizations could now afford their own computing
101resources, over which they exercised full control. This provided more freedom
102to tailor computing services to the specific needs of particular groups.
103\par
104However, many of the benefits offered by the timesharing systems were lost when
105the computing idiom first shifted to include personal-style machines. One of
106the prime casualties of this shift was the loss of the notion of a single name
107space for all files. Instead, workstation- and PC-based environments each had
108independent and completely disconnected file systems. The standardized
109mechanisms through which files could be transferred between machines (e.g.,
110FTP) were largely designed at a time when there were relatively few large
111machines that were connected over slow links. Although the newer multi-megabit
112per second communication pathways allowed for faster transfers, the problem of
113resource location in this environment was still not addressed. There was no
114longer a system-wide file system, or even a file location service, so
115individual users were more isolated from the organization's collective data.
116Overall, disk requirements ballooned, since lack of a shared file system was
117often resolved by replicating all programs and data to each machine that needed
118it. This proliferation of independent copies further complicated the problem of
119version control and management in this distributed world. Since computers were
120often no longer behind locked doors at a computer center, user authentication
121and authorization tasks became more complex. Also, since organizational
122managers were now in direct control of their computing facilities, they also
123had to actively manage the hardware and software upon which they depended.
124\par
125Overall, many of the benefits of the proliferation of independent,
126personal-style machines were partially offset by the communication and
127organizational penalties they imposed. Collaborative work and dissemination of
128information became more difficult now that the previously unified file system
129was fragmented among hundreds of autonomous machines.
130
131 \section sec2-2 Section 2.2: Distributed File Systems
132
133\par
134As a response to the situation outlined above, the notion of a distributed file
135system (DFS) was developed. Basically, a DFS provides a framework in which
136access to files is permitted regardless of their locations. Specifically, a
137distributed file system offers a single, common set of file system operations
138through which those accesses are performed.
139\par
140There are two major variations on the core DFS concept, classified according to
141the way in which file storage is managed. These high-level models are defined
142below.
143\li Peer-to-peer: In this symmetrical model, each participating machine
144provides storage for a specific set of files on its own attached disk(s), and
145allows others to access them remotely. Thus, each node in the DFS is capable of
146both importing files (making reference to files resident on foreign machines)
147and exporting files (allowing other machines to reference files located
148locally).
149\li Server-client: In this model, a set of machines designated as servers
150provide the storage for all of the files in the DFS. All other machines, known
151as clients, must direct their file references to these machines. Thus, servers
152are the sole exporters of files in the DFS, and clients are the sole importers.
153
154\par
155The notion of a DFS, whether organized using the peer-to-peer or server-client
156discipline, may be used as a conceptual base upon which the advantages of
157personal computing resources can be combined with the single-system benefits of
158classical timeshared operation.
159\par
160Many distributed file systems have been designed and deployed, operating on the
161fast local area networks available to connect machines within a single site.
162These systems include DOMAIN [9], DS [15], RFS [16], and Sprite [10]. Perhaps
163the most widespread of distributed file systems to date is a product from Sun
164Microsystems, NFS [13] [14], extending the popular unix file system so that it
165operates over local networks.
166
167 \section sec2-3 Section 2.3: Wide-Area Distributed File Systems
168
169\par
170Improvements in long-haul network technology are allowing for faster
171interconnection bandwidths and smaller latencies between distant sites.
172Backbone services have been set up across the country, and T1 (1.5
173megabit/second) links are increasingly available to a larger number of
174locations. Long-distance channels are still at best approximately an order of
175magnitude slower than the typical local area network, and often two orders of
176magnitude slower. The narrowed difference between local-area and wide-area data
177paths opens the window for the notion of a wide-area distributed file system
178(WADFS). In a WADFS, the transparency of file access offered by a local-area
179DFS is extended to cover machines across much larger distances. Wide-area file
180system functionality facilitates collaborative work and dissemination of
181information in this larger theater of operation.
182
183 \page chap3 Chapter 3: AFS-3 Design Goals
184
185 \section sec3-1 Section 3.1: Introduction
186
187\par
188This chapter describes the goals for the AFS-3 system, the first commercial
189WADFS in existence.
190\par
191The original AFS goals have been extended over the history of the project. The
192initial AFS concept was intended to provide a single distributed file system
193facility capable of supporting the computing needs of Carnegie Mellon
194University, a community of roughly 10,000 people. It was expected that most CMU
195users either had their own workstation-class machine on which to work, or had
196access to such machines located in public clusters. After being successfully
197implemented, deployed, and tuned in this capacity, it was recognized that the
198basic design could be augmented to link autonomous AFS installations located
199within the greater CMU campus. As described in Section 2.3, the long-haul
200networking environment developed to a point where it was feasible to further
201extend AFS so that it provided wide-area file service. The underlying AFS
202communication component was adapted to better handle the widely-varying channel
203characteristics encountered by intra-site and inter-site operations.
204\par
205A more detailed history of AFS evolution may be found in [3] and [18].
206
207 \section sec3-2 Section 3.2: System Goals
208
209\par
210At a high level, the AFS designers chose to extend the single-machine unix
211computing environment into a WADFS service. The unix system, in all of its
212numerous incarnations, is an important computing standard, and is in very wide
213use. Since AFS was originally intended to service the heavily unix-oriented CMU
214campus, this decision served an important tactical purpose along with its
215strategic ramifications.
216\par
217In addition, the server-client discipline described in Section 2.2 was chosen
218as the organizational base for AFS. This provides the notion of a central file
219store serving as the primary residence for files within a given organization.
220These centrally-stored files are maintained by server machines and are made
221accessible to computers running the AFS client software.
222\par
223Listed in the following sections are the primary goals for the AFS system.
224Chapter 4 examines how the AFS design decisions, concepts, and implementation
225meet this list of goals.
226
227 \subsection sec3-2-1 Section 3.2.1: Scale
228
229\par
230AFS differs from other existing DFSs in that it has the specific goal of
231supporting a very large user community with a small number of server machines.
232Unlike the rule-of-thumb ratio of approximately 20 client machines for every
233server machine (20:1) used by Sun Microsystems' widespread NFS distributed file
234system, the AFS architecture aims at smoothly supporting client/server ratios
235more along the lines of 200:1 within a single installation. In addition to
236providing a DFS covering a single organization with tens of thousands of users,
237AFS also aims at allowing thousands of independent, autonomous organizations to
238join in the single, shared name space (see Section 3.2.2 below) without a
239centralized control or coordination point. Thus, AFS envisions supporting the
240file system needs of tens of millions of users at interconnected yet autonomous
241sites.
242
243 \subsection sec3-2-2 Section 3.2.2: Name Space
244
245\par
246One of the primary strengths of the timesharing computing environment is the
247fact that it implements a single name space for all files in the system. Users
248can walk up to any terminal connected to a timesharing service and refer to its
249files by the identical name. This greatly encourages collaborative work and
250dissemination of information, as everyone has a common frame of reference. One
251of the major AFS goals is the extension of this concept to a WADFS. Users
252should be able to walk up to any machine acting as an AFS client, anywhere in
253the world, and use the identical file name to refer to a given object.
254\par
255In addition to the common name space, it was also an explicit goal for AFS to
256provide complete access transparency and location transparency for its files.
257Access transparency is defined as the system's ability to use a single
258mechanism to operate on a file, regardless of its location, local or remote.
259Location transparency is defined as the inability to determine a file's
260location from its name. A system offering location transparency may also
261provide transparent file mobility, relocating files between server machines
262without visible effect to the naming system.
263
264 \subsection sec3-2-3 Section 3.2.3: Performance
265
266\par
267Good system performance is a critical AFS goal, especially given the scale,
268client-server ratio, and connectivity specifications described above. The AFS
269architecture aims at providing file access characteristics which, on average,
270are similar to those of local disk performance.
271
272 \subsection sec3-2-4 Section 3.2.4: Security
273
274\par
275A production WADFS, especially one which allows and encourages transparent file
276access between different administrative domains, must be extremely conscious of
277security issues. AFS assumes that server machines are "trusted" within their
278own administrative domain, being kept behind locked doors and only directly
279manipulated by reliable administrative personnel. On the other hand, AFS client
280machines are assumed to exist in inherently insecure environments, such as
281offices and dorm rooms. These client machines are recognized to be
282unsupervisable, and fully accessible to their users. This situation makes AFS
283servers open to attacks mounted by possibly modified client hardware, firmware,
284operating systems, and application software. In addition, while an organization
285may actively enforce the physical security of its own file servers to its
286satisfaction, other organizations may be lax in comparison. It is important to
287partition the system's security mechanism so that a security breach in one
288administrative domain does not allow unauthorized access to the facilities of
289other autonomous domains.
290\par
291The AFS system is targeted to provide confidence in the ability to protect
292system data from unauthorized access in the above environment, where untrusted
293client hardware and software may attempt to perform direct remote file
294operations from anywhere in the world, and where levels of physical security at
295remote sites may not meet the standards of other sites.
296
297 \subsection sec3-2-5 Section 3.2.5: Access Control
298
299\par
300The standard unix access control mechanism associates mode bits with every file
301and directory, applying them based on the user's numerical identifier and the
302user's membership in various groups. This mechanism was considered too
303coarse-grained by the AFS designers. It was seen as insufficient for specifying
304the exact set of individuals and groups which may properly access any given
305file, as well as the operations these principals may perform. The unix group
306mechanism was also considered too coarse and inflexible. AFS was designed to
307provide more flexible and finer-grained control of file access, improving the
308ability to define the set of parties which may operate on files, and what their
309specific access rights are.
310
311 \subsection sec3-2-6 Section 3.2.6: Reliability
312
313\par
314The crash of a server machine in any distributed file system causes the
315information it hosts to become unavailable to the user community. The same
316effect is observed when server and client machines are isolated across a
317network partition. Given the potential size of the AFS user community, a single
318server crash could potentially deny service to a very large number of people.
319The AFS design reflects a desire to minimize the visibility and impact of these
320inevitable server crashes.
321
322 \subsection sec3-2-7 Section 3.2.7: Administrability
323
324\par
325Driven once again by the projected scale of AFS operation, one of the system's
326goals is to offer easy administrability. With the large projected user
327population, the amount of file data expected to be resident in the shared file
328store, and the number of machines in the environment, a WADFS could easily
329become impossible to administer unless its design allowed for easy monitoring
330and manipulation of system resources. It is also imperative to be able to apply
331security and access control mechanisms to the administrative interface.
332
333 \subsection sec3-2-8 Section 3.2.8: Interoperability/Coexistence
334
335\par
336Many organizations currently employ other distributed file systems, most
337notably Sun Microsystems' NFS, which is also an extension of the basic
338single-machine unix system. It is unlikely that AFS will receive significant
339use if it cannot operate concurrently with other DFSs without mutual
340interference. Thus, coexistence with other DFSs is an explicit AFS goal.
341\par
342A related goal is to provide a way for other DFSs to interoperate with AFS to
343various degrees, allowing AFS file operations to be executed from these
344competing systems. This is advantageous, since it may extend the set of
345machines which are capable of interacting with the AFS community. Hardware
346platforms and/or operating systems to which AFS is not ported may thus be able
347to use their native DFS system to perform AFS file references.
348\par
349These two goals serve to extend AFS coverage, and to provide a migration path
350by which potential clients may sample AFS capabilities, and gain experience
351with AFS. This may result in data migration into native AFS systems, or the
352impetus to acquire a native AFS implementation.
353
354 \subsection sec3-2-9 Section 3.2.9: Heterogeneity/Portability
355
356\par
357It is important for AFS to operate on a large number of hardware platforms and
358operating systems, since a large community of unrelated organizations will most
359likely utilize a wide variety of computing environments. The size of the
360potential AFS user community will be unduly restricted if AFS executes on a
361small number of platforms. Not only must AFS support a largely heterogeneous
362computing base, it must also be designed to be easily portable to new hardware
363and software releases in order to maintain this coverage over time.
364
365 \page chap4 Chapter 4: AFS High-Level Design
366
367 \section sec4-1 Section 4.1: Introduction
368
369\par
370This chapter presents an overview of the system architecture for the AFS-3
371WADFS. Different treatments of the AFS system may be found in several
372documents, including [3], [4], [5], and [2]. Certain system features discussed
373here are examined in more detail in the set of accompanying AFS programmer
374specification documents.
375\par
376After the architectural overview, the system goals enumerated in Chapter 3 are
377revisited, and the contribution of the various AFS design decisions and
378resulting features is noted.
379
380 \section sec4-2 Section 4.2: The AFS System Architecture
381
382 \subsection sec4-2-1 Section 4.2.1: Basic Organization
383
384\par
385As stated in Section 3.2, a server-client organization was chosen for the AFS
386system. A group of trusted server machines provides the primary disk space for
387the central store managed by the organization controlling the servers. File
388system operation requests for specific files and directories arrive at server
389machines from machines running the AFS client software. If the client is
390authorized to perform the operation, then the server proceeds to execute it.
391\par
392In addition to this basic file access functionality, AFS server machines also
393provide related system services. These include authentication service, mapping
394between printable and numerical user identifiers, file location service, time
395service, and such administrative operations as disk management, system
396reconfiguration, and tape backup.
397
398 \subsection sec4-2-2 Section 4.2.2: Volumes
399
400 \subsubsection sec4-2-2-1 Section 4.2.2.1: Definition
401
402\par
403Disk partitions used for AFS storage do not directly host individual user files
404and directories. Rather, connected subtrees of the system's directory structure
405are placed into containers called volumes. Volumes vary in size dynamically as
406the objects they house are inserted, overwritten, and deleted. Each volume has
407an associated quota, or maximum permissible storage. A single unix disk
408partition may thus host one or more volumes, and in fact may host as many
409volumes as physically fit in the storage space. However, the practical maximum
410is currently 3,500 volumes per disk partition. This limitation is imposed by
411the salvager program, which examines and repairs file system metadata
412structures.
413\par
414There are two ways to identify an AFS volume. The first option is a 32-bit
415numerical value called the volume ID. The second is a human-readable character
416string called the volume name.
417\par
418Internally, a volume is organized as an array of mutable objects, representing
419individual files and directories. The file system object associated with each
420index in this internal array is assigned a uniquifier and a data version
421number. A subset of these values is used to compose an AFS file identifier, or
422FID. FIDs are not normally visible to user applications, but rather are used
423internally by AFS. They consist of ordered triplets, whose components are the
424volume ID, the index within the volume, and the uniquifier for the index.
425\par
426To understand AFS FIDs, let us consider the case where index i in volume v
427refers to a file named example.txt. This file's uniquifier is currently set to
428one (1), and its data version number is currently set to zero (0). The AFS
429client software may then refer to this file with the following FID: (v, i, 1).
430The next time a client overwrites the object identified with the (v, i, 1) FID,
431the data version number for example.txt will be promoted to one (1). Thus, the
432data version number serves to distinguish between different versions of the
433same file. A higher data version number indicates a newer version of the file.
434\par
435Consider the result of deleting file (v, i, 1). This causes the body of
436example.txt to be discarded, and marks index i in volume v as unused. Should
437another program create a file, say a.out, within this volume, index i may be
438reused. If it is, the creation operation will bump the index's uniquifier to
439two (2), and the data version number is reset to zero (0). Any client caching a
440FID for the deleted example.txt file thus cannot affect the completely
441unrelated a.out file, since the uniquifiers differ.
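\par
As a concrete illustration of the FID concept described above, the sketch below models the (volume ID, index, uniquifier) triplet and the test for whether two FIDs name the same object. The structure and function are hypothetical and are not the actual AFS declarations.
\code
/* Illustrative only; the real AFS FID declaration differs in detail. */
struct ExampleFid {
    unsigned int volume;      /* 32-bit volume ID (v)                */
    unsigned int vnode;       /* index within the volume (i)         */
    unsigned int uniquifier;  /* bumped whenever the index is reused */
};

/*
 * example.txt occupies index i of volume v: FID (v, i, 1), data version 0.
 * Overwriting it leaves the FID unchanged but raises the data version to 1.
 * Deleting it and creating a.out in the same slot yields FID (v, i, 2), so a
 * stale cached reference to (v, i, 1) can never affect a.out.
 */
static int same_object(struct ExampleFid a, struct ExampleFid b)
{
    return a.volume == b.volume &&
           a.vnode == b.vnode &&
           a.uniquifier == b.uniquifier;
}
\endcode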
442
443 \subsubsection sec4-2-2-2 Section 4.2.2.2: Attachment
444
445\par
446The connected subtrees contained within individual volumes are attached to
447their proper places in the file space defined by a site, forming a single,
448apparently seamless unix tree. These attachment points, called mount points,
449are persistent file system objects, implemented as symbolic links whose
450contents obey a stylized format. Thus, AFS mount points differ from
451NFS-style mounts. In the NFS environment, the user dynamically mounts entire
452remote disk partitions using any desired name. These mounts do not survive
453client restarts, and do not insure a uniform namespace between different
454machines.
455\par
456A single volume is chosen as the root of the AFS file space for a given
457organization. By convention, this volume is named root.afs. Each client machine
458belonging to this organization performs a unix mount() of this root volume (not
459to be confused with an AFS mount point) on its empty /afs directory, thus
460attaching the entire AFS name space at this point.
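\par
The stylized mount point format mentioned above is commonly described as a symbolic link whose target names the volume, and optionally the cell, being attached, along the general lines of "#cellname:volumename." for a normal mount or "%cellname:volumename." for an explicitly read-write one. The helper below is a hypothetical sketch of how such a link target might be recognized; it is not the Cache Manager's actual parser.
\code
#include <string.h>

/* Hypothetical check: does a symbolic link target look like an AFS mount
 * point of the general form "#cell:volume." or "%volume."? */
static int looks_like_mount_point(const char *target)
{
    size_t len = strlen(target);

    if (len < 3)
        return 0;
    if (target[0] != '#' && target[0] != '%')  /* normal vs. read-write */
        return 0;
    return target[len - 1] == '.';             /* trailing dot terminator */
}
\endcode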
461
462 \subsubsection sec4-2-2-3 Section 4.2.2.3: Administrative Uses
463
464\par
465Volumes serve as the administrative unit for AFS file system data, providing
466the basis for replication, relocation, and backup operations.
467
468 \subsubsection sec4-2-2-4 Section 4.2.2.4: Replication
469
\par
470Read-only snapshots of AFS volumes may be created by administrative personnel.
471These clones may be deployed on up to eight disk partitions, on the same server
472machine or across different servers. Each clone has the identical volume ID,
473which must differ from its read-write parent. Thus, at most one clone of any
474given volume v may reside on a given disk partition. File references to this
475read-only clone volume may be serviced by any of the servers which host a copy.
476
477 \subsubsection sec4-2-2-5 Section 4.2.2.5: Backup
478
479\par
480Volumes serve as the unit of tape backup and restore operations. Backups are
481accomplished by first creating an on-line backup volume for each volume to be
482archived. This backup volume is organized as a copy-on-write shadow of the
483original volume, capturing the volume's state at the instant that the backup
484took place. Thus, the backup volume may be envisioned as being composed of a
485set of object pointers back to the original image. The first update operation
486on the file located in index i of the original volume triggers the
487copy-on-write association. This causes the file's contents at the time of the
488snapshot to be physically written to the backup volume before the newer version
489of the file is stored in the parent volume.
490\par
491Thus, AFS on-line backup volumes typically consume little disk space. On
492average, they are composed mostly of links and to a lesser extent the bodies of
493those few files which have been modified since the last backup took place.
494Also, the system does not have to be shut down to insure the integrity of the
495backup images. Dumps are generated from the unchanging backup volumes, and are
496transferred to tape at any convenient time before the next backup snapshot is
497performed.
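\par
The copy-on-write relationship between a parent volume and its backup volume can be pictured with the small sketch below. The types and routine are hypothetical; they simply capture the rule that the snapshot-time contents are preserved in the backup volume before the parent copy is first overwritten.
\code
#include <stdlib.h>
#include <string.h>

/* Hypothetical shadow record kept by the backup volume for one object. */
struct ShadowEntry {
    int   copied;       /* has this object been copied to the backup yet? */
    void *backup_data;  /* snapshot-time contents, valid once copied != 0 */
};

/* Invoked before the first update to the corresponding object in the
 * parent volume. */
static void preserve_before_update(struct ShadowEntry *shadow,
                                   const void *current_data, size_t len)
{
    if (!shadow->copied) {
        shadow->backup_data = malloc(len);
        if (shadow->backup_data != NULL) {
            memcpy(shadow->backup_data, current_data, len);
            shadow->copied = 1;
        }
    }
}
\endcode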
498
499 \subsubsection sec4-2-2-6 Section 4.2.2.6: Relocation
500
501\par
502Volumes may be moved transparently between disk partitions on a given file
503server, or between different file server machines. The transparency of volume
504motion comes from the fact that neither the user-visible names for the files
505nor the internal AFS FIDs contain server-specific location information.
506\par
507Interruption to file service while a volume move is being executed is typically
508on the order of a few seconds, regardless of the amount of data contained
509within the volume. This derives from the staged algorithm used to move a volume
510to a new server. First, a dump is taken of the volume's contents, and this
511image is installed at the new site. The second stage involves actually locking
512the original volume, taking an incremental dump to capture file updates since
513the first stage. The third stage installs the changes at the new site, and the
514fourth stage deletes the original volume. Further references to this volume
515will resolve to its new location.
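\par
The staged algorithm can be summarized in the pseudocode sketch below. All of the types and routines are hypothetical stand-ins for the real Volume Server and VLDB machinery, and error handling is omitted; only the ordering of the four stages is being illustrated.
\code
/* Hypothetical stand-ins; not the actual AFS interfaces. */
typedef struct volume volume;
typedef struct server server;
typedef struct dump   dump;

extern dump *dump_full(volume *v, server *from);
extern dump *dump_incremental(volume *v, server *from);
extern void  restore(dump *d, server *to);
extern void  lock_volume(volume *v, server *at);
extern void  update_vldb(volume *v, server *now_at);
extern void  delete_volume(volume *v, server *from);

static void move_volume(volume *v, server *old_site, server *new_site)
{
    restore(dump_full(v, old_site), new_site);        /* stage 1: full dump  */
    lock_volume(v, old_site);                         /* stage 2: brief lock */
    restore(dump_incremental(v, old_site), new_site); /* stage 3: catch up   */
    update_vldb(v, new_site);                         /*          repoint    */
    delete_volume(v, old_site);                       /* stage 4: clean up   */
}
\endcode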
516
517 \subsection sec4-2-3 Section 4.2.3: Authentication
518
519\par
520AFS uses the Kerberos [22] [23] authentication system developed at MIT's
521Project Athena to provide reliable identification of the principals attempting
522to operate on the files in its central store. Kerberos provides for mutual
523authentication, not only assuring AFS servers that they are interacting with
524the stated user, but also assuring AFS clients that they are dealing with the
525proper server entities and not imposters. Authentication information is
526mediated through the use of tickets. Clients register passwords with the
527authentication system, and use those passwords during authentication sessions
528to secure these tickets. A ticket is an object which contains an encrypted
529version of the user's name and other information. The file server machines may
530request that a caller present its ticket in the course of a file system
531operation. If the file server can successfully decrypt the ticket, then it
532knows that it was created and delivered by the authentication system, and may
533trust that the caller is the party identified within the ticket.
534\par
535Such subjects as mutual authentication, encryption and decryption, and the use
536of session keys are complex ones. Readers are directed to the above references
537for a complete treatment of Kerberos-based authentication.
538
539 \subsection sec4-2-4 Section 4.2.4: Authorization
540
541 \subsubsection sec4-2-4-1 Section 4.2.4.1: Access Control Lists
542
543\par
544AFS implements per-directory Access Control Lists (ACLs) to improve the ability
545to specify which sets of users have access to the files within the directory,
546and which operations they may perform. ACLs are used in addition to the
547standard unix mode bits. ACLs are organized as lists of one or more (principal,
548rights) pairs. A principal may be either the name of an individual user or a
549group of individual users. There are seven expressible rights, as listed below.
550\li Read (r): The ability to read the contents of the files in a directory.
551\li Lookup (l): The ability to look up names in a directory.
552\li Write (w): The ability to create new files and overwrite the contents of
553existing files in a directory.
554\li Insert (i): The ability to insert new files in a directory, but not to
555overwrite existing files.
556\li Delete (d): The ability to delete files in a directory.
557\li Lock (k): The ability to acquire and release advisory locks on a given
558directory.
559\li Administer (a): The ability to change a directory's ACL.
560
561 \subsubsection sec4-2-4-2 Section 4.2.4.2: AFS Groups
562
563\par
564AFS users may create a certain number of groups, differing from the standard
565unix notion of group. These AFS groups are objects that may be placed on ACLs,
566and simply contain a list of AFS user names that are to be treated identically
567for authorization purposes. For example, user erz may create a group called
568erz:friends consisting of the kazar, vasilis, and mason users. Should erz wish
569to grant read, lookup, and insert rights to this group in directory d, he
570should create an entry reading (erz:friends, rli) in d's ACL.
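\par
Conceptually, an ACL entry of this kind pairs a principal name with a small set of rights. The sketch below models the seven rights as bits and shows the (erz:friends, rli) entry from the example; the bit values and structure are illustrative only and do not reflect AFS's actual encoding.
\code
/* Illustrative only; not AFS's actual rights encoding. */
enum {
    RIGHT_READ       = 1 << 0,  /* r */
    RIGHT_LOOKUP     = 1 << 1,  /* l */
    RIGHT_WRITE      = 1 << 2,  /* w */
    RIGHT_INSERT     = 1 << 3,  /* i */
    RIGHT_DELETE     = 1 << 4,  /* d */
    RIGHT_LOCK       = 1 << 5,  /* k */
    RIGHT_ADMINISTER = 1 << 6   /* a */
};

struct AclEntry {
    const char  *principal;  /* user or group name, e.g. "erz:friends" */
    unsigned int rights;     /* union of the RIGHT_* bits              */
};

/* The example entry granting read, lookup, and insert ("rli") rights. */
static const struct AclEntry example_entry = {
    "erz:friends", RIGHT_READ | RIGHT_LOOKUP | RIGHT_INSERT
};
\endcode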
571\par
572AFS offers three special, built-in groups, as described below.
573\par
5741. system:anyuser: Any individual who accesses AFS files is considered by the
575system to be a member of this group, whether or not they hold an authentication
576ticket. This group is unusual in that it doesn't have a stable membership. In
577fact, it doesn't have an explicit list of members. Instead, the system:anyuser
578"membership" grows and shrinks as file accesses occur, with users being
579(conceptually) added and deleted automatically as they interact with the
580system.
581\par
582The system:anyuser group is typically put on the ACL of those directories for
583which some specific level of completely public access is desired, covering any
584user at any AFS site.
585\par
5862. system:authuser: Any individual in possession of a valid Kerberos ticket
587minted by the organization's authentication service is treated as a member of
588this group. Just as with system:anyuser, this special group does not have a
589stable membership. If a user acquires a ticket from the authentication service,
590they are automatically "added" to the group. If the ticket expires or is
591discarded by the user, then the given individual will automatically be
592"removed" from the group.
593\par
594The system:authuser group is usually put on the ACL of those directories for
595which some specific level of intra-site access is desired. Anyone holding a
596valid ticket within the organization will be allowed to perform the set of
597accesses specified by the ACL entry, regardless of their precise individual ID.
598\par
5993. system:administrators: This built-in group defines the set of users capable
600of performing certain important administrative operations within the cell.
601Members of this group have explicit 'a' (ACL administration) rights on every
602directory's ACL in the organization. Members of this group are the only ones
603which may legally issue administrative commands to the file server machines
604within the organization. This group is not like the other two described above
605in that it does have a stable membership, where individuals are added to and
606deleted from the group explicitly.
607\par
608The system:administrators group is typically put on the ACL of those
609directories which contain sensitive administrative information, or on those
610places where only administrators are allowed to make changes. All members of
611this group have implicit rights to change the ACL on any AFS directory within
612their organization. Thus, they don't have to actually appear on an ACL, or have
613'a' rights enabled in their ACL entry if they do appear, to be able to modify
614the ACL.
615
616 \subsection sec4-2-5 Section 4.2.5: Cells
617
618\par
619A cell is the set of server and client machines managed and operated by an
620administratively independent organization, as fully described in the original
621proposal [17] and specification [18] documents. The cell's administrators make
622decisions concerning such issues as server deployment and configuration, user
623backup schedules, and replication strategies on their own hardware and disk
624storage completely independently from those implemented by other cell
625administrators regarding their own domains. Every client machine belongs to
626exactly one cell, and uses that information to determine where to obtain
627default system resources and services.
628\par
629The cell concept allows autonomous sites to retain full administrative control
630over their facilities while allowing them to collaborate in the establishment
631of a single, common name space composed of the union of their individual name
632spaces. By convention, any file name beginning with /afs is part of this shared
633global name space and can be used at any AFS-capable machine. The original
634mount point concept was modified to contain cell information, allowing volumes
635housed in foreign cells to be mounted in the file space. Again by convention,
636the top-level /afs directory contains a mount point to the root.cell volume for
637each cell in the AFS community, attaching their individual file spaces. Thus,
638the top of the data tree managed by cell xyz is represented by the /afs/xyz
639directory.
640\par
641Creating a new AFS cell is straightforward, with the operation taking three
642basic steps:
643\par
6441. Name selection: A prospective site has to first select a unique name for
645itself. Cell name selection is inspired by the hierarchical Domain naming
646system. Domain-style names are designed to be assignable in a completely
647decentralized fashion. Example cell names are transarc.com, ssc.gov, and
648umich.edu. These names correspond to the AFS installations at Transarc
649Corporation in Pittsburgh, PA, the Superconducting Supercollider Lab in Dallas,
650TX, and the University of Michigan at Ann Arbor, MI, respectively.
651\par
6522. Server installation: Once a cell name has been chosen, the site must bring
653up one or more AFS file server machines, creating a local file space and a
654suite of local services, including authentication (Section 4.2.6.4) and volume
655location (Section 4.2.6.2).
656\par
6573. Advertise services: In order for other cells to discover the presence of the
658new site, it must advertise its name and which of its machines provide basic
659AFS services such as authentication and volume location. An established site
660may then record the machines providing AFS system services for the new cell,
661and then set up its mount point under /afs. By convention, each cell places the
662top of its file tree in a volume named root.cell.
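\par
The information a cell advertises in step 3 amounts, conceptually, to a small record: the cell's name together with the machines providing its database services. The representation and host names below are purely hypothetical illustrations, not the actual AFS configuration format.
\code
/* Hypothetical representation of advertised cell information. */
struct CellInfo {
    const char *name;           /* e.g. "transarc.com"                  */
    const char *db_servers[8];  /* hosts offering volume location,
                                   authentication, and protection
                                   database service                     */
};

/* A client that learns of this cell can then mount its root.cell volume
 * at /afs/transarc.com.  Host names here are invented for illustration. */
static const struct CellInfo example_cell = {
    "transarc.com",
    { "db1.transarc.com", "db2.transarc.com", NULL }
};
\endcode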
663
664 \subsection sec4-2-6 Section 4.2.6: Implementation of Server
665Functionality
666
667\par
668AFS server functionality is implemented by a set of user-level processes which
669execute on server machines. This section examines the role of each of these
670processes.
671
672 \subsubsection sec4-2-6-1 Section 4.2.6.1: File Server
673
674\par
675This AFS entity is responsible for providing a central disk repository for a
676particular set of files within volumes, and for making these files accessible
677to properly-authorized users running on client machines.
678
679 \subsubsection sec4-2-6-2 Section 4.2.6.2: Volume Location Server
680
681\par
682The Volume Location Server maintains and exports the Volume Location Database
683(VLDB). This database tracks the server or set of servers on which volume
684instances reside. Among the operations it supports are queries returning volume
685location and status information, volume ID management, and creation, deletion,
686and modification of VLDB entries.
687\par
688The VLDB may be replicated to two or more server machines for availability and
689load-sharing reasons. A Volume Location Server process executes on each server
690machine on which a copy of the VLDB resides, managing that copy.
691
692 \subsubsection sec4-2-6-3 Section 4.2.6.3: Volume Server
693
694\par
695The Volume Server allows administrative tasks and probes to be performed on the
696set of AFS volumes residing on the machine on which it is running. These
697operations include volume creation and deletion, renaming volumes, dumping and
698restoring volumes, altering the list of replication sites for a read-only
699volume, creating and propagating a new read-only volume image, creation and
700update of backup volumes, listing all volumes on a partition, and examining
701volume status.
702
703 \subsubsection sec4-2-6-4 Section 4.2.6.4: Authentication Server
704
705\par
706The AFS Authentication Server maintains and exports the Authentication Database
707(ADB). This database tracks the encrypted passwords of the cell's users. The
708Authentication Server interface allows operations that manipulate ADB entries.
709It also implements the Kerberos mutual authentication protocol, supplying the
710appropriate identification tickets to successful callers.
711\par
712The ADB may be replicated to two or more server machines for availability and
713load-sharing reasons. An Authentication Server process executes on each server
714machine on which a copy of the ADB resides, managing that copy.
715
716 \subsubsection sec4-2-6-5 Section 4.2.6.5: Protection Server
717
718\par
719The Protection Server maintains and exports the Protection Database (PDB),
720which maps between printable user and group names and their internal numerical
721AFS identifiers. The Protection Server also allows callers to create, destroy,
722query ownership and membership, and generally manipulate AFS user and group
723records.
724\par
725The PDB may be replicated to two or more server machines for availability and
726load-sharing reasons. A Protection Server process executes on each server
727machine on which a copy of the PDB resides, managing that copy.
728
729 \subsubsection sec4-2-6-6 Section 4.2.6.6: BOS Server
730
731\par
732The BOS Server is an administrative tool which runs on each file server machine
733in a cell. This server is responsible for monitoring the health of the AFS
734agent processes on that machine. The BOS Server brings up the chosen set of
735AFS agents in the proper order after a system reboot, answers requests as to
736their status, and restarts them when they fail. It also accepts commands to
737start, suspend, or resume these processes, and to install new server binaries.
738
739 \subsubsection sec4-2-6-7 Section 4.2.6.7: Update Server/Client
740
741\par
742The Update Server and Update Client programs are used to distribute important
743system files and server binaries. For example, consider the case of
744distributing a new File Server binary to the set of Sparcstation server
745machines in a cell. One of the Sparcstation servers is declared to be the
746distribution point for its machine class, and is configured to run an Update
747Server. The new binary is installed in the appropriate local directory on that
748Sparcstation distribution point. Each of the other Sparcstation servers runs an
749Update Client instance, which periodically polls the proper Update Server. The
750new File Server binary will be detected and copied over to the client. Thus,
751new server binaries need only be installed manually once per machine type, and
752the distribution to like server machines will occur automatically.
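\par
In essence, each non-distribution-point server runs a periodic poll-and-copy loop against its machine class's distribution point. The sketch below is a simplified illustration with hypothetical helper routines; it is not the actual Update Client implementation.
\code
#include <unistd.h>

/* Hypothetical helpers standing in for the real Update Client logic. */
extern long query_remote_version(const char *server, const char *path);
extern long query_local_version(const char *path);
extern void fetch_file(const char *server, const char *path);

static void update_client_loop(const char *dist_point, const char *binary)
{
    for (;;) {
        if (query_remote_version(dist_point, binary) >
            query_local_version(binary))
            fetch_file(dist_point, binary);  /* pull the newer binary */
        sleep(300);                          /* poll periodically     */
    }
}
\endcode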
753
754 \subsection sec4-2-7 Section 4.2.7: Implementation of Client
755Functionality
756
757 \subsubsection sec4-2-7-1 Section 4.2.7.1: Introduction
758
759\par
760The portion of the AFS WADFS which runs on each client machine is called the
761Cache Manager. This code, running within the client's kernel, is a user's
762representative in communicating and interacting with the File Servers. The
763Cache Manager's primary responsibility is to create the illusion that the
764remote AFS file store resides on the client machine's local disk(s).
765\par
766As implied by its name, the Cache Manager supports this illusion by maintaining
767a cache of files referenced from the central AFS store on the machine's local
768disk. All file operations executed by client application programs on files
769within the AFS name space are handled by the Cache Manager and are realized on
770these cached images. Client-side AFS references are directed to the Cache
771Manager via the standard VFS and vnode file system interfaces pioneered and
772advanced by Sun Microsystems [21]. The Cache Manager stores and fetches files
773to and from the shared AFS repository as necessary to satisfy these operations.
774It is responsible for parsing unix pathnames on open() operations and mapping
775each component of the name to the File Server or group of File Servers that
776house the matching directory or file.
777\par
778The Cache Manager has additional responsibilities. It also serves as a reliable
779repository for the user's authentication information, holding on to their
780tickets and wielding them as necessary when challenged during File Server
781interactions. It caches volume location information gathered from probes to the
782VLDB, and keeps the client machine's local clock synchronized with a reliable
783time source.
784
785 \subsubsection sec4-2-7-2 Section 4.2.7.2: Chunked Access
786
787\par
788In previous AFS incarnations, whole-file caching was performed. Whenever an AFS
789file was referenced, the entire contents of the file were stored on the
790client's local disk. This approach had several disadvantages. One problem was
791that no file larger than the amount of disk space allocated to the client's
792local cache could be accessed.
793\par
794AFS-3 supports chunked file access, allowing individual 64 kilobyte pieces to
795be fetched and stored. Chunking allows AFS files of any size to be accessed
796from a client. The chunk size is settable at each client machine, but the
797default chunk size of 64K was chosen so that most unix files would fit within a
798single chunk.
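\par
With a fixed chunk size, translating a file offset into a chunk index and an offset within that chunk is a simple calculation, sketched below for the default 64-kilobyte chunks. This illustrates the arithmetic only; it is not the Cache Manager's actual code.
\code
#define CHUNK_SIZE (64 * 1024)  /* default AFS-3 chunk size */

/* Map a byte offset within a file to (chunk index, offset within chunk). */
static void offset_to_chunk(long file_offset, long *chunk, long *offset_in_chunk)
{
    *chunk = file_offset / CHUNK_SIZE;
    *offset_in_chunk = file_offset % CHUNK_SIZE;
}
\endcode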
799
800 \subsubsection sec4-2-7-3 Section 4.2.7.3: Cache Management
801
802\par
803The use of a file cache by the AFS client-side code, as described above, raises
804the thorny issue of cache consistency. Each client must efficiently determine
805whether its cached file chunks are identical to the corresponding sections of
806the file as stored at the server machine before allowing a user to operate on
807those chunks.
808\par
809AFS employs the notion of a callback as the backbone of its cache consistency
810algorithm. When a server machine delivers one or more chunks of a file to a
811client, it also includes a callback "promise" that the client will be notified
812if any modifications are made to the data in the file at the server. Thus, as
813long as the client machine is in possession of a callback for a file, it knows
814it is correctly synchronized with the centrally-stored version, and allows its
815users to operate on it as desired without any further interaction with the
816server. Before a file server stores a more recent version of a file on its own
817disks, it will first break all outstanding callbacks on this item. A callback
818will eventually time out, even if there are no changes to the file or directory
819it covers.
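\par
The consistency rule can thus be summarized as follows: cached data may be used without contacting the server exactly while a valid (unbroken, unexpired) callback is held; otherwise the client must revalidate with the File Server. The schematic sketch below uses hypothetical names and is not the Cache Manager's actual code.
\code
#include <time.h>

/* Hypothetical per-file cache state kept by a client. */
struct CachedFile {
    int    callback_valid;    /* cleared when the server breaks the callback */
    time_t callback_expires;  /* callbacks also time out on their own        */
};

/* Hypothetical revalidation routine. */
extern void refetch_and_renew_callback(struct CachedFile *f);

static void ensure_fresh(struct CachedFile *f)
{
    if (!f->callback_valid || time(NULL) >= f->callback_expires)
        refetch_and_renew_callback(f);  /* contact the File Server */
    /* otherwise the cached data is known to match the central copy */
}
\endcode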
820
821 \subsection sec4-2-8 Section 4.2.8: Communication Substrate: Rx
822
823\par
824All AFS system agents employ remote procedure call (RPC) interfaces. Thus,
825servers may be queried and operated upon regardless of their location.
826\par
827The Rx RPC package is used by all AFS agents to provide a high-performance,
828multi-threaded, and secure communication mechanism. The Rx protocol is
829adaptive, conforming itself to widely varying network communication media
830encountered by a WADFS. It allows user applications to define and insert their
831own security modules, allowing them to execute the precise end-to-end
832authentication algorithms required to suit their specific needs and goals. Rx
833offers two built-in security modules. The first is the null module, which does
834not perform any encryption or authentication checks. The second built-in
835security module is rxkad, which utilizes Kerberos authentication.
836\par
837Although pervasive throughout the AFS distributed file system, all of its
838agents, and many of its standard application programs, Rx is entirely separable
839from AFS and does not depend on any of its features. In fact, Rx can be used to
840build applications engaging in RPC-style communication under a variety of
841unix-style file systems. There are in-kernel and user-space implementations of
842the Rx facility, with both sharing the same interface.
843
844 \subsection sec4-2-9 Section 4.2.9: Database Replication: ubik
845
846\par
847The three AFS system databases (VLDB, ADB, and PDB) may be replicated to
848multiple server machines to improve their availability and share access loads
849among the replication sites. The ubik replication package is used to implement
850this functionality. A full description of ubik and of the quorum completion
851algorithm it implements may be found in [19] and [20].
852\par
853The basic abstraction provided by ubik is that of a disk file replicated to
854multiple server locations. One machine is considered to be the synchronization
855site, handling all write operations on the database file. Read operations may
856be directed to any of the active members of the quorum, namely a subset of the
857replication sites large enough to insure integrity across such failures as
858individual server crashes and network partitions. All of the quorum members
859participate in regular elections to determine the current synchronization site.
860The ubik algorithms allow server machines to enter and exit the quorum in an
861orderly and consistent fashion.
862\par
863All operations to one of these replicated "abstract files" are performed as
864part of a transaction. If all the related operations performed under a
865transaction are successful, then the transaction is committed, and the changes
866are made permanent. Otherwise, the transaction is aborted, and all of the
867operations for that transaction are undone.
868\par
869Like Rx, the ubik facility may be used by client applications directly. Thus,
870user applications may easily implement the notion of a replicated disk file in
871this fashion.
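\par
From an application's point of view, use of such a replicated file follows a begin/operate/commit-or-abort pattern. The sketch below illustrates that pattern with hypothetical names; it is not the actual ubik programming interface.
\code
/* Hypothetical stand-ins for a ubik-style transactional interface. */
typedef struct trans trans;

extern trans *begin_transaction(const char *replicated_file, int write);
extern int    apply_update(trans *t, const void *record, int len);
extern int    commit(trans *t);      /* make all updates permanent           */
extern void   abort_trans(trans *t); /* undo every update in the transaction */

static int replace_record(const char *db, const void *rec, int len)
{
    trans *t = begin_transaction(db, 1 /* write */);
    if (t == NULL)
        return -1;
    if (apply_update(t, rec, len) != 0) {
        abort_trans(t);  /* any failure undoes the whole transaction */
        return -1;
    }
    return commit(t);    /* changes become visible at the replicas */
}
\endcode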
872
873 \subsection sec4-2-10 Section 4.2.10: System Management
874
875\par
876There are several AFS features aimed at facilitating system management. Some of
877these features have already been mentioned, such as volumes, the BOS Server,
878and the pervasive use of secure RPCs throughout the system to perform
879administrative operations from any AFS client machine in the worldwide
880community. This section covers additional AFS features and tools that assist in
881making the system easier to manage.
882
883 \subsubsection sec4-2-10-1 Section 4.2.10.1: Intelligent Access
884Programs
885
886\par
887A set of intelligent user-level applications were written so that the AFS
888system agents could be more easily queried and controlled. These programs
889accept user input, then translate the caller's instructions into the proper
890RPCs to the responsible AFS system agents, in the proper order.
891\par
892An example of this class of AFS application programs is vos, which mediates
893access to the Volume Server and the Volume Location Server agents. Consider the
894vos move operation, which results in a given volume being moved from one site
895to another. The Volume Server does not support a complex operation like a
896volume move directly. In fact, this move operation involves the Volume Servers
897at the current and new machines, as well as the Volume Location Server, which
898tracks volume locations. Volume moves are accomplished by a combination of full
899and incremental volume dump and restore operations, and a VLDB update. The vos
900move command issues the necessary RPCs in the proper order, and attempts to
901recover from errors at each of the steps.
902\par
903The end result is that the AFS interface presented to system administrators is
904much simpler and more powerful than that offered by the raw RPC interfaces
905themselves. The learning curve for administrative personnel is thus flattened.
906Also, automated execution of complex system operations is more likely to be
907successful and free from human error.
908
909 \subsubsection sec4-2-10-2 Section 4.2.10.2: Monitoring Interfaces
910
911\par
912The various AFS agent RPC interfaces provide calls which allow for the
913collection of system status and performance data. This data may be displayed by
914such programs as scout, which graphically depicts File Server performance
915numbers and disk utilizations. Such monitoring capabilities allow for quick
916detection of system problems. They also support detailed performance analyses,
917which may indicate the need to reconfigure system resources.
918
919 \subsubsection sec4-2-10-3 Section 4.2.10.3: Backup System
920
921\par
922A special backup system has been designed and implemented for AFS, as described
923in [6]. It is not sufficient to simply dump the contents of all File Server
924partitions onto tape, since volumes are mobile, and need to be tracked
925individually. The AFS backup system allows hierarchical dump schedules to be
926built based on volume names. It generates the appropriate RPCs to create the
927required backup volumes and to dump these snapshots to tape. A database is used
928to track the backup status of system volumes, along with the set of tapes on
929which backups reside.
930
931 \subsection sec4-2-11 Section 4.2.11: Interoperability
932
933\par
934Since the client portion of the AFS software is implemented as a standard
935VFS/vnode file system object, AFS can be installed into client kernels and
936utilized without interference with other VFS-style file systems, such as
937vanilla unix and the NFS distributed file system.
938\par
939Certain machines either cannot or choose not to run the AFS client software
940natively. If these machines run NFS, it is still possible to access AFS files
941through a protocol translator. The NFS-AFS Translator may be run on any machine
942at the given site that runs both NFS and the AFS Cache Manager. All of the NFS
943machines that wish to access the AFS shared store proceed to NFS-mount the
944translator's /afs directory. File references generated at the NFS-based
945machines are received at the translator machine, which is acting in its
946capacity as an NFS server. The file data is actually obtained when the
947translator machine issues the corresponding AFS references in its role as an
948AFS client.
949
950 \section sec4-3 Section 4.3: Meeting AFS Goals
951
952\par
953The AFS WADFS design, as described in this chapter, serves to meet the system
954goals stated in Chapter 3. This section revisits each of these AFS goals, and
955identifies the specific architectural constructs that bear on them.
956
957 \subsection sec4-3-1 Section 4.3.1: Scale
958
959\par
960To date, AFS has been deployed to over 140 sites world-wide, with approximately
96160 of these cells visible on the public Internet. AFS sites are currently
962operating in several European countries, in Japan, and in Australia. While many
963sites are modest in size, certain cells contain more than 30,000 accounts. AFS
964sites have realized client/server ratios in excess of the targeted 200:1.
965
966 \subsection sec4-3-2 Section 4.3.2: Name Space
967
968\par
969A single uniform name space has been constructed across all cells in the
970greater AFS user community. Any pathname beginning with /afs may indeed be used
971at any AFS client. A set of common conventions regarding the organization of
972the top-level /afs directory and several directories below it have been
973established. These conventions also assist in the location of certain per-cell
974resources, such as AFS configuration files.
975\par
976Both access transparency and location transparency are supported by AFS, as
977evidenced by the common access mechanisms and by the ability to transparently
978relocate volumes.

 \subsection sec4-3-3 Section 4.3.3: Performance

\par
AFS employs caching extensively at all levels to reduce the cost of "remote"
references. Measured data cache hit ratios are very high, often over 95%. This
indicates that the file images kept on local disk are very effective in
satisfying the set of remote file references generated by clients. The
introduction of file system callbacks has also proven very effective in
implementing cache synchronization efficiently. Replicating files and system
databases across multiple server machines distributes load among the given
servers. The Rx RPC subsystem has operated successfully at network speeds
ranging from 19.2 kilobytes/second to experimental gigabit/second FDDI
networks.
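\par
The role callbacks play in keeping cache management inexpensive may be sketched
as follows. The types and functions below are illustrative stand-ins, not the
Cache Manager's actual data structures: a cached file is used with no server
traffic at all for as long as its callback promise is held, and is refetched
only after that promise is broken.
\code
// Simplified sketch of callback-driven cache validation. A client may use its
// cached copy of a file without contacting the File Server as long as it
// still holds a callback promise for that file; the server breaks the promise
// when another client stores a new version. Illustrative types only.
#include <stdio.h>

struct cached_file {
    const char *name;
    int has_callback;   // nonzero while the server's callback promise holds
    int data_version;   // version of the locally cached image
};

// Hypothetical stand-in for an RPC that fetches the file and re-registers a
// callback with the File Server.
static void fetch_from_server(struct cached_file *f)
{
    printf("fetching %s from the File Server\n", f->name);
    f->data_version++;      // pretend a newer version was retrieved
    f->has_callback = 1;    // server promises to notify us of changes
}

static void open_file(struct cached_file *f)
{
    if (f->has_callback)    // cache hit: no network traffic at all
        printf("using cached %s (version %d)\n", f->name, f->data_version);
    else
        fetch_from_server(f);
}

int main(void)
{
    struct cached_file f = { "/afs/transarc.com/public/notes", 0, 0 };
    open_file(&f);       // first reference: must fetch and obtain a callback
    open_file(&f);       // subsequent references: served from the cache
    f.has_callback = 0;  // simulate a callback break from the File Server
    open_file(&f);       // must revalidate by fetching again
    return 0;
}
\endcode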
\par
Even at the intra-site level, AFS has been shown to deliver good performance,
especially in high-load situations. One often-quoted study [1] compared the
performance of an older version of AFS with that of NFS on a large file system
task named the Andrew Benchmark. While NFS sometimes outperformed AFS at low
load levels, its performance fell off rapidly at higher loads, whereas AFS
performance was not significantly affected.

 \subsection sec4-3-4 Section 4.3.4: Security

\par
The use of Kerberos as the AFS authentication system fits the security goal
nicely. Access to AFS files from untrusted client machines is predicated on the
caller's possession of the appropriate Kerberos ticket(s). Setting up per-site,
Kerberos-based authentication services confines any security breach to the
cell which was compromised. Since the Cache Manager will store multiple
tickets for its users, they may take on different identities depending on the
set of file servers being accessed.

 \subsection sec4-3-5 Section 4.3.5: Access Control

\par
AFS extends the standard unix authorization mechanism with per-directory Access
Control Lists. These ACLs allow specific AFS principals and groups of these
principals to be granted a wide variety of rights on the associated files.
Users may create and manipulate AFS group entities without administrative
assistance, and place these tailored groups on ACLs.
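\par
The following sketch illustrates how per-directory ACL evaluation might look.
AFS-3 grants such rights as read, lookup, insert, delete, write, lock, and
administer to individual principals and to user-created groups; the structures
below are simplified stand-ins rather than the real server-side representation.
\code
// Illustrative sketch of per-directory ACL evaluation; simplified stand-ins
// for the actual AFS structures.
#include <stdio.h>
#include <string.h>

enum {
    RIGHT_READ   = 1 << 0,
    RIGHT_LOOKUP = 1 << 1,
    RIGHT_INSERT = 1 << 2,
    RIGHT_DELETE = 1 << 3,
    RIGHT_WRITE  = 1 << 4,
    RIGHT_LOCK   = 1 << 5,
    RIGHT_ADMIN  = 1 << 6
};

struct acl_entry {
    const char *principal;  // an individual user or a user-created group
    unsigned int rights;    // bitwise OR of the rights above
};

// Return the rights a principal holds on a directory's ACL (0 if absent).
static unsigned int rights_for(const struct acl_entry *acl, int n,
                               const char *principal)
{
    for (int i = 0; i < n; i++)
        if (strcmp(acl[i].principal, principal) == 0)
            return acl[i].rights;
    return 0;
}

int main(void)
{
    struct acl_entry acl[] = {
        { "erz", RIGHT_READ | RIGHT_LOOKUP | RIGHT_INSERT | RIGHT_DELETE |
                 RIGHT_WRITE | RIGHT_ADMIN },
        { "erz:collaborators", RIGHT_READ | RIGHT_LOOKUP },
    };
    unsigned int r = rights_for(acl, 2, "erz:collaborators");
    printf("group may read: %s, may write: %s\n",
           (r & RIGHT_READ) ? "yes" : "no",
           (r & RIGHT_WRITE) ? "yes" : "no");
    return 0;
}
\endcode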

 \subsection sec4-3-6 Section 4.3.6: Reliability

\par
A subset of file server crashes is masked by the use of read-only replication
on volumes containing slowly-changing files. Availability of important,
frequently-used programs such as editors and compilers may thus be greatly
improved. Since the level of replication may be chosen per volume, and easily
changed, each site may decide the proper replication levels for certain
programs and/or data.
Similarly, replicated system databases help to maintain service in the face of
server crashes and network partitions.

 \subsection sec4-3-7 Section 4.3.7: Administrability

\par
Several features contribute to the administrability of the AFS system:
pervasive, secure RPC interfaces to all AFS system components; volumes;
overseer processes for the monitoring and management of file system agents;
intelligent user-level access tools; interface routines providing performance
and statistics information; and an automated backup service tailored to a
volume-based environment.

 \subsection sec4-3-8 Section 4.3.8: Interoperability/Coexistence

\par
Due to its VFS-style implementation, the AFS client code may be easily
installed in the machine's kernel, and may service file requests without
interfering in the operation of any other installed file system. Machines
either not capable of running AFS natively or choosing not to do so may still
access AFS files via NFS with the help of a protocol translator agent.

 \subsection sec4-3-9 Section 4.3.9: Heterogeneity/Portability

\par
As most modern kernels use a VFS-style interface to support their native file
systems, AFS may usually be ported to a new hardware and/or software
environment in a relatively straightforward fashion. Such ease of porting
allows AFS to run on a wide variety of platforms.
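\par
A VFS/vnode-style kernel routes file operations through per-file-system tables
of function pointers, so porting AFS to such a kernel largely amounts to
supplying an AFS operations vector, which then coexists with the tables of the
native file systems. The sketch below is schematic only and is not the vnode
interface of any particular kernel.
\code
// Schematic sketch of VFS/vnode-style dispatch: each file system registers
// its own operations vector, and AFS is added without disturbing the others.
// This is not the vnode interface of any particular kernel.
#include <stdio.h>
#include <string.h>

struct vnode_ops {
    const char *fs_name;
    int (*open)(const char *path);
    int (*read)(const char *path, char *buf, int len);
};

static int afs_open(const char *path)
{
    printf("afs_open(%s): consult cache, callbacks, and ACLs\n", path);
    return 0;
}
static int afs_read(const char *path, char *buf, int len)
{
    (void)buf;
    printf("afs_read(%s, %d bytes): satisfied from the chunk cache\n", path, len);
    return 0;
}
static int ufs_open(const char *path)
{
    printf("ufs_open(%s): local disk\n", path);
    return 0;
}
static int ufs_read(const char *path, char *buf, int len)
{
    (void)buf;
    printf("ufs_read(%s, %d bytes): local disk\n", path, len);
    return 0;
}

// The registered file systems; adding AFS does not disturb the others.
static const struct vnode_ops file_systems[] = {
    { "ufs", ufs_open, ufs_read },
    { "afs", afs_open, afs_read },
};

// Choose an operations vector for a path (trivially, by prefix, here).
static const struct vnode_ops *vfs_for(const char *path)
{
    return strncmp(path, "/afs", 4) == 0 ? &file_systems[1] : &file_systems[0];
}

int main(void)
{
    char buf[32];
    const char *p = "/afs/transarc.com/public/notes";
    const struct vnode_ops *ops = vfs_for(p);
    ops->open(p);
    ops->read(p, buf, (int)sizeof(buf));
    return 0;
}
\endcode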

 \page chap5 Chapter 5: Future AFS Design Refinements

 \section sec5-1 Section 5.1: Overview

\par
The current AFS WADFS design and implementation provides a high-performance,
scalable, secure, and flexible computing environment. However, there is room
for improvement on a variety of fronts. This chapter considers a set of topics,
examining the shortcomings of the current AFS system and how additional
functionality may be fruitfully constructed.
\par
Many of these areas are already being addressed in the next-generation AFS
system which is being built as part of the Open Software Foundation's (OSF)
Distributed Computing Environment [7] [8].

 \section sec5-2 Section 5.2: unix Semantics

\par
Any distributed file system which extends the unix file system model to include
remote file accesses presents its application programs with failure modes which
do not exist in a single-machine unix implementation. This semantic difference
is difficult to mask.
\par
The current AFS design varies from pure unix semantics in other ways. In a
single-machine unix environment, modifications made to an open file are
immediately visible to other processes with open file descriptors to the same
file. AFS does not reproduce this behavior when programs on different machines
access the same file. Changes made to one cached copy of the file are not made
immediately visible to other cached copies. The changes are only made visible
to other access sites when a modified version of a file is stored back to the
server providing its primary disk storage. Thus, one client's changes may be
entirely overwritten by another client's modifications. The situation is
further complicated by the possibility that dirty file chunks may be flushed
out to the File Server before the file is closed.
\par
The version of AFS created for the OSF offering extends the current, untyped
callback notion to a set of multiple, independent synchronization guarantees.
These synchronization tokens allow functionality not offered by AFS-3,
including byte-range mandatory locking, exclusive file opens, and read and
write privileges over portions of a file.
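\par
The typed-token idea may be sketched as follows. The token types and conflict
rule shown here are illustrative and far simpler than an actual token manager:
read tokens over overlapping byte ranges coexist, while a write token conflicts
with any other token whose range overlaps.
\code
// Sketch of typed synchronization tokens over byte ranges; illustrative only.
#include <stdio.h>

enum token_type { TOKEN_READ, TOKEN_WRITE };

struct token {
    enum token_type type;
    long start, end;    // byte range [start, end) covered by the token
};

static int ranges_overlap(const struct token *a, const struct token *b)
{
    return a->start < b->end && b->start < a->end;
}

// Two tokens conflict if their ranges overlap and at least one is a write.
static int tokens_conflict(const struct token *a, const struct token *b)
{
    return ranges_overlap(a, b) &&
           (a->type == TOKEN_WRITE || b->type == TOKEN_WRITE);
}

int main(void)
{
    struct token held    = { TOKEN_WRITE, 0,    4096 };
    struct token reader  = { TOKEN_READ,  1024, 2048 };
    struct token writer2 = { TOKEN_WRITE, 8192, 9216 };

    printf("read  [1024,2048) vs held write: %s\n",
           tokens_conflict(&held, &reader) ? "conflict (revoke first)" : "ok");
    printf("write [8192,9216) vs held write: %s\n",
           tokens_conflict(&held, &writer2) ? "conflict (revoke first)" : "ok");
    return 0;
}
\endcode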

 \section sec5-3 Section 5.3: Improved Name Space Management

\par
Discovery of new AFS cells and their integration into each existing cell's name
space is a completely manual operation in the current system. As the rate of
new cell creations increases, the load imposed on system administrators also
increases. Also, representing each cell's file space entry as a mount point
object in the /afs directory leads to a potential problem. As the number of
entries in the /afs directory increases, search time through the directory also
grows.
\par
One improvement to this situation is to implement the top-level /afs directory
through a Domain-style database. The database would map cell names to the set
of server machines providing authentication and volume location services for
that cell. The Cache Manager would query the cell database in the course of
pathname resolution, and cache its lookup results.
\par
In this database-style environment, adding a new cell entry under /afs is
accomplished by creating the appropriate database entry. The new cell
information is then immediately accessible to all AFS clients.
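\par
A Cache Manager lookup against such a cell database might proceed along the
following lines. The database contents, structures, and function names are
hypothetical; they merely illustrate the query-then-cache pattern described
above.
\code
// Sketch of resolving a top-level /afs entry through a cell database rather
// than a static mount point, caching the result for later pathname lookups.
// The database, its contents, and the function names are hypothetical.
#include <stdio.h>
#include <string.h>

#define MAX_SERVERS 3

struct cell_entry {
    const char *cell;                     // e.g. "transarc.com"
    const char *db_servers[MAX_SERVERS];  // authentication/volume location servers
};

// Stand-in for the authoritative, Domain-style cell database.
static const struct cell_entry cell_database[] = {
    { "transarc.com",   { "db1.transarc.com", "db2.transarc.com", NULL } },
    { "andrew.cmu.edu", { "db1.andrew.cmu.edu", NULL, NULL } },
};

// A tiny Cache Manager-side cache of previously resolved cells.
static const struct cell_entry *cell_cache[8];
static int cell_cache_used;

static const struct cell_entry *resolve_cell(const char *cell)
{
    // First consult the local cache...
    for (int i = 0; i < cell_cache_used; i++)
        if (strcmp(cell_cache[i]->cell, cell) == 0)
            return cell_cache[i];
    // ...then query the cell database and remember the answer.
    for (unsigned i = 0; i < sizeof(cell_database) / sizeof(cell_database[0]); i++)
        if (strcmp(cell_database[i].cell, cell) == 0) {
            if (cell_cache_used < 8)
                cell_cache[cell_cache_used++] = &cell_database[i];
            return &cell_database[i];
        }
    return NULL;    // unknown cell: the /afs/<cell> lookup fails
}

int main(void)
{
    const struct cell_entry *e = resolve_cell("transarc.com");
    if (e)
        printf("/afs/%s served by %s\n", e->cell, e->db_servers[0]);
    resolve_cell("transarc.com");   // second lookup satisfied from the cache
    return 0;
}
\endcode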

 \section sec5-4 Section 5.4: Read/Write Replication

\par
The AFS-3 servers and databases are currently equipped to handle read-only
replication exclusively. However, other distributed file systems have
demonstrated the feasibility of providing full read/write replication of data
in environments very similar to AFS [11]. Such systems can serve as models for
the set of required changes.

 \section sec5-5 Section 5.5: Disconnected Operation

\par
Several facilities are provided by AFS so that server failures and network
partitions may be completely or partially masked. However, AFS does not provide
for completely disconnected operation of file system clients. Disconnected
operation is a mode in which a client continues to access critical data during
accidental or intentional inability to access the shared file repository. After
some period of autonomous operation on the set of cached files, the client
reconnects with the repository and resynchronizes the contents of its cache
with the shared store.
\par
Studies of related systems provide evidence that such disconnected operation is
feasible [11] [12]. Such a capability may be explored for AFS.
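\par
One way to picture the resynchronization step is as the replay of a log of
mutating operations recorded while disconnected. The sketch below is
illustrative only and is not drawn from AFS or Coda code.
\code
// Sketch of disconnected operation: while the shared repository is
// unreachable, mutating operations are applied to cached copies and appended
// to a log; on reconnection the log is replayed against the servers.
#include <stdio.h>

enum op_kind { OP_STORE, OP_MKDIR, OP_REMOVE };

struct logged_op {
    enum op_kind kind;
    const char *path;
};

static struct logged_op replay_log[16];
static int log_len;

static void record(enum op_kind kind, const char *path)
{
    if (log_len < 16)
        replay_log[log_len++] = (struct logged_op){ kind, path };
}

// Called when connectivity to the file servers returns.
static void resynchronize(void)
{
    for (int i = 0; i < log_len; i++)
        printf("replaying op %d on %s against the server\n",
               replay_log[i].kind, replay_log[i].path);
    log_len = 0;    // cache and shared store are in agreement again
}

int main(void)
{
    // Disconnected: keep working against the cache, remembering each change.
    record(OP_MKDIR, "/afs/transarc.com/usr/erz/notes");
    record(OP_STORE, "/afs/transarc.com/usr/erz/notes/todo");
    // Reconnected: push the accumulated changes back to the shared store.
    resynchronize();
    return 0;
}
\endcode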

 \section sec5-6 Section 5.6: Multiprocessor Support

\par
The LWP lightweight thread package used by all AFS system processes assumes
that individual threads may execute non-preemptively, and that all other
threads are quiescent until control is explicitly relinquished from within the
currently active thread. These assumptions conspire to prevent AFS from
operating correctly on a multiprocessor platform.
\par
A solution to this restriction is to restructure the AFS code organization so
that the proper locking is performed. Thus, critical sections which were
previously only implicitly defined are explicitly specified.
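\par
Making a previously implicit critical section explicit amounts to bracketing it
with a lock. The sketch below uses POSIX threads purely for illustration and is
not taken from the AFS sources.
\code
// Sketch of turning an implicit LWP-era critical section into an explicit
// one. Under non-preemptive LWP, no other thread could run between the two
// updates below; with preemptive threads on a multiprocessor, the same
// invariant must be protected by an explicit lock.
#include <pthread.h>
#include <stdio.h>

static pthread_mutex_t cache_lock = PTHREAD_MUTEX_INITIALIZER;
static int chunks_in_use;
static int bytes_in_use;

// Invariant: chunks_in_use and bytes_in_use are updated together.
static void reserve_chunk(int chunk_bytes)
{
    pthread_mutex_lock(&cache_lock);    // explicit critical section begins
    chunks_in_use += 1;
    bytes_in_use += chunk_bytes;
    pthread_mutex_unlock(&cache_lock);  // explicit critical section ends
}

static void *worker(void *arg)
{
    (void)arg;
    for (int i = 0; i < 100000; i++)
        reserve_chunk(64);
    return NULL;
}

int main(void)
{
    pthread_t t1, t2;
    pthread_create(&t1, NULL, worker, NULL);
    pthread_create(&t2, NULL, worker, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    printf("chunks=%d bytes=%d\n", chunks_in_use, bytes_in_use);
    return 0;
}
\endcode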

 \page biblio Bibliography

\li [1] John H. Howard, Michael L. Kazar, Sherri G. Menees, David A. Nichols,
M. Satyanarayanan, Robert N. Sidebotham, Michael J. West, Scale and Performance
in a Distributed File System, ACM Transactions on Computer Systems, Vol. 6,
No. 1, February 1988, pp. 51-81.
\li [2] Michael L. Kazar, Synchronization and Caching Issues in the Andrew File
System, USENIX Proceedings, Dallas, TX, Winter 1988.
\li [3] Alfred Z. Spector, Michael L. Kazar, Uniting File Systems, Unix
Review, March 1989.
\li [4] Johna Till Johnson, Distributed File System Brings LAN Technology to
WANs, Data Communications, November 1990, pp. 66-67.
\li [5] Michael Padovano, PADCOM Associates, AFS widens your horizons in
distributed computing, Systems Integration, March 1991.
\li [6] Steve Lammert, The AFS 3.0 Backup System, LISA IV Conference
Proceedings, Colorado Springs, Colorado, October 1990.
\li [7] Michael L. Kazar, Bruce W. Leverett, Owen T. Anderson, Vasilis
Apostolides, Beth A. Bottos, Sailesh Chutani, Craig F. Everhart, W. Anthony
Mason, Shu-Tsui Tu, Edward R. Zayas, DEcorum File System Architectural
Overview, USENIX Conference Proceedings, Anaheim, Texas, Summer 1990.
\li [8] AFS Drives DCE Selection, Digital Desktop, Vol. 1, No. 6,
September 1990.
\li [9] P.H. Levine, The Apollo DOMAIN Distributed File System, in NATO ASI
Series: Theory and Practice of Distributed Operating Systems, Y. Paker, J-P.
Banatre, M. Bozyigit, editors, Springer-Verlag, 1987.
\li [10] M.N. Nelson, B.B. Welch, J.K. Ousterhout, Caching in the Sprite
Network File System, ACM Transactions on Computer Systems, Vol. 6, No. 1,
February 1988.
\li [11] James J. Kistler, M. Satyanarayanan, Disconnected Operation in the
Coda File System, CMU School of Computer Science technical report,
CMU-CS-91-166, 26 July 1991.
\li [12] Puneet Kumar, M. Satyanarayanan, Log-Based Directory Resolution in
the Coda File System, CMU School of Computer Science internal document, 2
July 1991.
\li [13] Sun Microsystems, Inc., NFS: Network File System Protocol
Specification, RFC 1094, March 1989.
\li [14] Sun Microsystems, Inc., Design and Implementation of the Sun Network
File System, USENIX Summer Conference Proceedings, June 1985.
\li [15] C.H. Sauer, D.W. Johnson, L.K. Loucks, A.A. Shaheen-Gouda, and T.A.
Smith, RT PC Distributed Services Overview, Operating Systems Review, Vol. 21,
No. 3, July 1987.
\li [16] A.P. Rifkin, M.P. Forbes, R.L. Hamilton, M. Sabrio, S. Shah, and
K. Yueh, RFS Architectural Overview, USENIX Conference Proceedings, Atlanta,
Summer 1986.
\li [17] Edward R. Zayas, Administrative Cells: Proposal for Cooperative Andrew
File Systems, Information Technology Center internal document, Carnegie Mellon
University, 25 June 1987.
\li [18] Edward R. Zayas, Craig Everhart, Design and Specification of the
Cellular Andrew Environment, Information Technology Center, Carnegie Mellon
University, CMU-ITC-070, 2 August 1988.
\li [19] Michael L. Kazar, Ubik - A Library For Managing Ubiquitous Data,
Information Technology Center, Carnegie Mellon University, ITCID, Pittsburgh,
PA, 1988.
\li [20] Michael L. Kazar, Quorum Completion, Information Technology Center,
Carnegie Mellon University, ITCID, Pittsburgh, PA, 1988.
\li [21] S.R. Kleiman, Vnodes: An Architecture for Multiple File System Types
in Sun UNIX, Conference Proceedings, 1986 Summer USENIX Technical Conference,
pp. 238-247, El Toro, CA, 1986.
\li [22] S.P. Miller, B.C. Neuman, J.I. Schiller, J.H. Saltzer, Kerberos
Authentication and Authorization System, Project Athena Technical Plan, Section
E.2.1, M.I.T., December 1987.
\li [23] Bill Bryant, Designing an Authentication System: a Dialogue in Four
Scenes, Project Athena internal document, M.I.T., draft of 8 February 1988.


*/