Commit | Line | Data |
---|---|---|
805e021f CE |
1 | =head1 NAME |
2 | ||
3 | fs_checkservers - Displays the status of server machines | |
4 | ||
5 | =head1 SYNOPSIS | |
6 | ||
7 | =for html | |
8 | <div class="synopsis"> | |
9 | ||
10 | B<fs checkservers> S<<< [B<-cell> <I<cell to check>>] >>> [B<-all>] [B<-fast>] | |
11 | S<<< [B<-interval> <I<seconds between probes>>] >>> [B<-help>] | |
12 | ||
13 | B<fs checks> S<<< [B<-c> <I<cell to check>>] >>> [B<-a>] [B<-f>] | |
14 | S<<< [B<-i> <I<seconds between probes>>] >>> [B<-h>] | |
15 | ||
16 | =for html | |
17 | </div> | |
18 | ||
19 | =head1 DESCRIPTION | |
20 | ||
21 | The B<fs checkservers> command reports whether certain AFS server machines | |
22 | are accessible from the local client machine. The machines belong to one | |
23 | of two classes, and the Cache Manager maintains a list of them in kernel | |
24 | memory: | |
25 | ||
26 | =over 4 | |
27 | ||
28 | =item * | |
29 | ||
30 | The database server machines for every cell listed in the local | |
31 | F</usr/vice/etc/CellServDB> file, plus any machines added to the memory | |
32 | list by the B<fs newcell> command since the last reboot. | |
33 | ||
34 | =item * | |
35 | ||
36 | All file server machines the Cache Manager has recently contacted, and | |
37 | which it probably needs to contact again soon. In most cases, the Cache | |
38 | Manager holds a callback on a file or volume fetched from the machine. | |
39 | ||
40 | =back | |
41 | ||
42 | If the Cache Manager is unable to contact the B<vlserver> process on a | |
43 | database server machine or the B<fileserver> process on a file server | |
44 | machine, it marks the machine as inaccessible. (Actually, if a file server | |
45 | machine is multihomed, the Cache Manager attempts to contact all of the | |
46 | machine's interfaces, and only marks the machine as down if the | |
47 | B<fileserver> fails to reply via any of them.) The Cache Manager then | |
48 | periodically (by default, every three minutes) sends a probe to each | |
49 | marked machine, to see if it is still inaccessible. If a previously | |
50 | inaccessible machine responds, the Cache Manager marks it as accessible | |
51 | and no longer sends the periodic probes to it. | |
52 | ||
53 | The B<fs checkservers> command updates the list of inaccessible machines | |
54 | by having the Cache Manager probe a specified set of them: | |
55 | ||
56 | =over 4 | |
57 | ||
58 | =item * | |
59 | ||
60 | By default, only machines that are marked inaccessible and belong to the | |
61 | local cell (the cell listed in the local F</usr/vice/etc/ThisCell> | |
62 | file). | |
63 | ||
64 | =item * | |
65 | ||
66 | If the B<-cell> argument is included, only machines that are marked | |
67 | inaccessible and belong to the specified cell. | |
68 | ||
69 | =item * | |
70 | ||
71 | If the B<-all> flag is included, all machines marked inaccessible. | |
72 | ||
73 | =back | |
74 | ||
75 | If the B<-fast> flag is included, the Cache Manager does not probe any | |
76 | machines, but instead reports the results of the most recent previous | |
77 | probe. | |
78 | ||
79 | To set the interval between probes rather than produce a list of | |
80 | inaccessible machines, use the B<-interval> argument. The non-default | |
81 | setting persists until the machine reboots; to preserve it across reboots, | |
82 | put the appropriate B<fs checkservers> command in the machine's AFS | |
83 | initialization files. | |
84 | ||
85 | =head1 CAUTIONS | |
86 | ||
87 | The command can take quite a while to complete if a number of machines do | |
88 | not respond to the Cache Manager's probe. The Cache Manager probes | |
89 | machines sequentially and waits a standard timeout period before marking | |
90 | a machine as unresponsive, to allow for slow network communication. To | |
91 | make the command shell prompt return quickly, put the command in the | |
92 | background. It is harmless to interrupt the command by typing Ctrl-C or | |
93 | another interrupt signal. | |
94 | ||
95 | Note that the Cache Manager probes only server machines marked | |
96 | inaccessible in its memory list. A server machine's absence from the | |
97 | output does not necessarily mean that it is functioning, because it | |
98 | possibly is not included in the memory list at all (if, for example, the | |
99 | Cache Manager has not contacted it recently). For the same reason, the | |
100 | output is likely to vary on different client machines. | |
101 | ||
102 | Unlike most B<fs> commands, the B<fs checkservers> command does not refer | |
103 | to the AFSCELL environment variable. | |
104 | ||
105 | =head1 OPTIONS | |
106 | ||
107 | =over 4 | |
108 | ||
109 | =item B<-cell> <I<cell to check>> | |
110 | ||
111 | Names the cell in which to probe server machines marked as | |
112 | inaccessible. Provide the fully qualified domain name, or a shortened form | |
113 | that disambiguates it from the other cells listed in the local | |
114 | F</usr/vice/etc/CellServDB> file. Combine this argument with the B<-fast> | |
115 | flag if desired, but not with the B<-all> flag. Omit both this argument | |
116 | and the B<-all> flag to probe machines in the local cell only. | |
117 | ||
118 | =item B<-all> | |
119 | ||
120 | Probes all machines in the Cache Manager's memory list that are marked | |
121 | inaccessible. Combine this flag with the B<-fast> flag if desired, but | |
122 | not with the B<-cell> argument. Omit both this flag and the B<-cell> | |
123 | argument to probe machines in the local cell only. | |
124 | ||
125 | =item B<-fast> | |
126 | ||
127 | Displays the Cache Manager's current list of machines that are | |
128 | inaccessible, rather than sending new probes. The output can be as old as | |
129 | current setting of the probe interval (by default three minutes, and | |
130 | maximum ten minutes). | |
131 | ||
132 | =item B<-interval> <I<seconds between probes>> | |
133 | ||
134 | Sets or reports the number of seconds between the Cache Manager's probes | |
135 | to machines in the memory list that are marked inaccessible: | |
136 | ||
137 | =over 4 | |
138 | ||
139 | =item * | |
140 | ||
141 | To set the interval, specify a value between C<1> and C<600> | |
142 | (10 minutes); the default is C<180> (three minutes). The issuer must be | |
143 | logged in as the local superuser C<root>. The altered setting persists | |
144 | until changed again with this command, or until the machine reboots, at | |
145 | which time the setting returns to the default. | |
146 | ||
147 | =item * | |
148 | ||
149 | Provide a value of C<0> (zero) to display the current interval setting. No | |
150 | privilege is required. Do not combine this argument with any other. | |
151 | ||
152 | =back | |
153 | ||
154 | =item B<-help> | |
155 | ||
156 | Prints the online help for this command. All other valid options are | |
157 | ignored. | |
158 | ||
159 | =back | |
160 | ||
161 | =head1 OUTPUT | |
162 | ||
163 | If there are no machines marked as inaccessible, or if all of them now | |
164 | respond to the Cache Manager's probe, the output is: | |
165 | ||
166 | All servers are running. | |
167 | ||
168 | Note that this message does not mean that all server machines in each | |
169 | relevant cell are running. The output indicates the status of only those | |
170 | machines that the Cache Manager probes. | |
171 | ||
172 | If a machine fails to respond to the probe within the timeout period, the | |
173 | output begins with the string | |
174 | ||
175 | These servers unavailable due to network or server problems: | |
176 | ||
177 | and lists the hostname of each machine on its own line. The Cache Manager | |
178 | stores machine records by Internet address, so the format of each hostname | |
179 | (uppercase or lowercase letters, or an Internet address in dotted decimal | |
180 | format) depends on how the local cell's name service translates it at the | |
181 | time the command is issued. If a server machine is multihomed, the output | |
182 | lists only one of its interfaces (usually, the currently most preferred | |
183 | one). | |
184 | ||
185 | If the B<-interval> argument is provided with a value between C<1> and | |
186 | C<600>, there is no output. If the value is C<0>, the output reports the | |
187 | probe interval as follows: | |
188 | ||
189 | The current down server probe interval is <interval> secs | |
190 | ||
191 | =head1 EXAMPLES | |
192 | ||
193 | The following command displays the Cache Manager's current list of | |
194 | unresponsive machines in the local cell, rather than probing them | |
195 | again. The output indicates that if there were any machines marked | |
196 | inaccessible, they all responded to the previous probe. | |
197 | ||
198 | % fs checkservers -fast | |
199 | All servers are running. | |
200 | ||
201 | The following example probes machines in the Cache Manager's memory list | |
202 | that belong to the C<example.org> cell: | |
203 | ||
204 | % fs checkservers -cell example.org | |
205 | All servers are running. | |
206 | ||
207 | The following example probes all server machines in the Cache Manager's | |
208 | memory list. It reports that two machines did not respond to the probe. | |
209 | ||
210 | % fs checkservers -all | |
211 | These servers unavailable due to network or server problems: | |
212 | fs1.example.com SV3.EXAMPLE.ORG. | |
213 | ||
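 | The following example, a sketch that assumes the issuer is logged in as | |
 | the local superuser C<root>, sets the probe interval to 300 seconds, which | |
 | produces no output, and then displays the new setting. The value C<300> | |
 | is an arbitrary illustration. | |
 | ||
 | % fs checkservers -interval 300 | |
 | % fs checkservers -interval 0 | |
 | The current down server probe interval is 300 secs | |
 | ||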
214 | =head1 PRIVILEGE REQUIRED | |
215 | ||
216 | To set the probe interval, the issuer must be logged in as the local | |
217 | superuser C<root>. Otherwise, no privilege is required. | |
218 | ||
219 | =head1 SEE ALSO | |
220 | ||
221 | L<CellServDB(5)>, | |
222 | L<ThisCell(5)>, | |
223 | L<fs_newcell(1)> | |
224 | ||
225 | =head1 COPYRIGHT | |
226 | ||
227 | IBM Corporation 2000. <http://www.ibm.com/> All Rights Reserved. | |
228 | ||
229 | This documentation is covered by the IBM Public License Version 1.0. It was | |
230 | converted from HTML to POD by software written by Chas Williams and Russ | |
231 | Allbery, based on work by Alf Wachsmann and Elizabeth Cassell. |