| 1 | =head1 NAME |
| 2 | |
| 3 | fs_checkservers - Displays the status of server machines |
| 4 | |
| 5 | =head1 SYNOPSIS |
| 6 | |
| 7 | =for html |
| 8 | <div class="synopsis"> |
| 9 | |
| 10 | B<fs checkservers> S<<< [B<-cell> <I<cell to check>>] >>> [B<-all>] [B<-fast>] |
| 11 | S<<< [B<-interval> <I<seconds between probes>>] >>> [B<-help>] |
| 12 | |
| 13 | B<fs checks> S<<< [B<-c> <I<cell to check>>] >>> [B<-a>] [B<-f>] |
| 14 | S<<< [B<-i> <I<seconds between probes>>] >>> [B<-h>] |
| 15 | |
| 16 | =for html |
| 17 | </div> |
| 18 | |
| 19 | =head1 DESCRIPTION |
| 20 | |
| 21 | The B<fs checkservers> command reports whether certain AFS server machines |
| 22 | are accessible from the local client machine. The machines belong to one |
| 23 | of two classes, and the Cache Manager maintains a list of them in kernel |
| 24 | memory: |
| 25 | |
| 26 | =over 4 |
| 27 | |
| 28 | =item * |
| 29 | |
| 30 | The database server machines for every cell listed in the local |
| 31 | F</usr/vice/etc/CellServDB> file, plus any machines added to the memory |
| 32 | list by the B<fs newcell> command since the last reboot. |
| 33 | |
| 34 | =item * |
| 35 | |
| 36 | All file server machines the Cache Manager has recently contacted, and |
| 37 | which it probably needs to contact again soon. In most cases, the Cache |
| 38 | Manager holds a callback on a file or volume fetched from the machine. |
| 39 | |
| 40 | =back |
| 41 | |
| 42 | If the Cache Manager is unable to contact the B<vlserver> process on a |
| 43 | database server machine or the B<fileserver> process on a file server |
| 44 | machine, it marks the machine as inaccessible. (Actually, if a file server |
| 45 | machine is multihomed, the Cache Manager attempts to contact all of the |
| 46 | machine's interfaces, and only marks the machine as down if the |
| 47 | B<fileserver> fails to reply via any of them.) The Cache Manager then |
| 48 | periodically (by default, every three minutes) sends a probe to each |
| 49 | marked machine, to see if it is still inaccessible. If a previously |
| 50 | inaccessible machine responds, the Cache Manager marks it as accessible |
| 51 | and no longer sends the periodic probes to it. |
| 52 | |
| 53 | The B<fs checkservers> command updates the list of inaccessible machines |
| 54 | by having the Cache Manager probe a specified set of them: |
| 55 | |
| 56 | =over 4 |
| 57 | |
| 58 | =item * |
| 59 | |
| 60 | By default, only machines that are marked inaccessible and belong to the |
| 61 | local cell (the cell listed in the local F</usr/vice/etc/ThisCell> |
| 62 | file). |
| 63 | |
| 64 | =item * |
| 65 | |
| 66 | If the B<-cell> argument is included, only machines that are marked |
| 67 | inaccessible and belong to the specified cell. |
| 68 | |
| 69 | =item * |
| 70 | |
| 71 | If the B<-all> flag is included, all machines marked inaccessible. |
| 72 | |
| 73 | =back |
| 74 | |
| 75 | If the B<-fast> flag is included, the Cache Manager does not probe any |
| 76 | machines, but instead reports the results of the most recent previous |
| 77 | probe. |
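
The selection rules above can be modeled in a few lines of Python. This is only an illustrative sketch, not the Cache Manager's actual code: the C<ServerRecord> class, the C<select_probe_targets> function, and their parameter names are hypothetical, and the memory list is represented as a plain Python list.

```python
from dataclasses import dataclass


@dataclass
class ServerRecord:
    """Hypothetical stand-in for one entry in the Cache Manager's memory list."""
    host: str
    cell: str
    inaccessible: bool


def select_probe_targets(records, local_cell, cell=None, all_cells=False, fast=False):
    """Return the records that would be probed for a given flag combination."""
    if cell is not None and all_cells:
        raise ValueError("-cell and -all cannot be combined")
    if fast:
        # -fast reports the previous results and sends no new probes.
        return []
    # Only machines already marked inaccessible are ever probed.
    marked = [r for r in records if r.inaccessible]
    if all_cells:
        return marked
    # With neither -cell nor -all, the local cell is used.
    target_cell = cell if cell is not None else local_cell
    return [r for r in marked if r.cell == target_cell]
```

In this model, a machine that is not marked inaccessible is never selected, which is why its absence from the command's output says nothing about its health.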
| 78 | |
| 79 | To set the interval between probes rather than produce a list of |
| 80 | inaccessible machines, use the B<-interval> argument. The non-default |
| 81 | setting persists until the machine reboots; to preserve it across reboots, |
| 82 | put the appropriate B<fs checkservers> command in the machine's AFS |
| 83 | initialization files. |
| 84 | |
| 85 | =head1 CAUTIONS |
| 86 | |
| 87 | The command can take quite a while to complete if a number of machines do |
| 88 | not respond to the Cache Manager's probe. The Cache Manager probes |
| 89 | machines sequentially and waits a standard timeout period before marking |
| 90 | the machine as unresponsive, to allow for slow network communication. To |
| 91 | make the command shell prompt return quickly, put the command in the |
| 92 | background. It is harmless to interrupt the command by typing Ctrl-C or |
| 93 | another interrupt signal. |
| 94 | |
| 95 | Note that the Cache Manager probes only server machines marked |
| 96 | inaccessible in its memory list. A server machine's absence from the |
| 97 | output does not necessarily mean that it is functioning, because it |
| 98 | possibly is not included in the memory list at all (if, for example, the |
| 99 | Cache Manager has not contacted it recently). For the same reason, the |
| 100 | output is likely to vary on different client machines. |
| 101 | |
| 102 | Unlike most B<fs> commands, the B<fs checkservers> command does not refer to |
| 103 | the AFSCELL environment variable. |
| 104 | |
| 105 | =head1 OPTIONS |
| 106 | |
| 107 | =over 4 |
| 108 | |
| 109 | =item B<-cell> <I<cell to check>> |
| 110 | |
| 111 | Names each cell in which to probe server machines marked as |
| 112 | inaccessible. Provide the fully qualified domain name, or a shortened form |
| 113 | that disambiguates it from the other cells listed in the local |
| 114 | F</usr/vice/etc/CellServDB> file. Combine this argument with the B<-fast> |
| 115 | flag if desired, but not with the B<-all> flag. Omit both this argument |
| 116 | and the B<-all> flag to probe machines in the local cell only. |
| 117 | |
| 118 | =item B<-all> |
| 119 | |
| 120 | Probes all machines in the Cache Manager's memory list that are marked |
| 121 | inaccessible. Combine this flag with the B<-fast> flag if desired, but |
| 122 | not with the B<-cell> argument. Omit both this flag and the B<-cell> |
| 123 | argument to probe machines in the local cell only. |
| 124 | |
| 125 | =item B<-fast> |
| 126 | |
| 127 | Displays the Cache Manager's current list of machines that are |
| 128 | inaccessible, rather than sending new probes. The output can be as old as the |
| 129 | current setting of the probe interval (by default three minutes, and |
| 130 | maximum ten minutes). |
| 131 | |
| 132 | =item B<-interval> <I<seconds between probes>> |
| 133 | |
| 134 | Sets or reports the number of seconds between the Cache Manager's probes |
| 135 | to machines in the memory list that are marked inaccessible: |
| 136 | |
| 137 | =over 4 |
| 138 | |
| 139 | =item * |
| 140 | |
| 141 | To set the interval, specify a value between C<1> and C<600> |
| 142 | (10 minutes); the default is C<180> (three minutes). The issuer must be |
| 143 | logged in as the local superuser C<root>. The altered setting persists |
| 144 | until again changed with this command, or until the machine reboots, at |
| 145 | which time the setting returns to the default. |
| 146 | |
| 147 | =item * |
| 148 | |
| 149 | Provide a value of C<0> (zero) to display the current interval setting. No |
| 150 | privilege is required. Do not combine this argument with any other. |
| 151 | |
| 152 | =back |
| 153 | |
| 154 | =item B<-help> |
| 155 | |
| 156 | Prints the online help for this command. All other valid options are |
| 157 | ignored. |
| 158 | |
| 159 | =back |
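
A wrapper script might check these option rules before invoking the command. The following Python sketch is one possible way to do so; the function name C<build_checkservers_argv> and its parameters are hypothetical. It reads the restriction on combining B<-interval> with other arguments conservatively, applying it to every B<-interval> value, not only C<0>.

```python
def build_checkservers_argv(cell=None, all_cells=False, fast=False, interval=None):
    """Build an argument list for fs checkservers, rejecting invalid combinations."""
    if interval is not None:
        # 0 displays the current setting; 1 through 600 sets a new one.
        if isinstance(interval, bool) or not isinstance(interval, int):
            raise ValueError("interval must be an integer")
        if not 0 <= interval <= 600:
            raise ValueError("interval must be between 0 and 600 seconds")
        # Assumption: treat -interval as exclusive of all other options.
        if cell is not None or all_cells or fast:
            raise ValueError("-interval is not combined with other options")
        return ["fs", "checkservers", "-interval", str(interval)]
    if cell is not None and all_cells:
        raise ValueError("-cell and -all cannot be combined")
    argv = ["fs", "checkservers"]
    if cell is not None:
        argv += ["-cell", cell]
    if all_cells:
        argv.append("-all")
    if fast:
        # -fast may accompany either -cell or -all.
        argv.append("-fast")
    return argv
```

The returned list can be passed to C<subprocess.run>. Setting an interval other than C<0> still requires running the command as the local superuser C<root>.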
| 160 | |
| 161 | =head1 OUTPUT |
| 162 | |
| 163 | If there are no machines marked as inaccessible, or if all of them now |
| 164 | respond to the Cache Manager's probe, the output is: |
| 165 | |
| 166 |     All servers are running. |
| 167 | |
| 168 | Note that this message does not mean that all server machines in each |
| 169 | relevant cell are running. The output indicates the status of only those |
| 170 | machines that the Cache Manager probes. |
| 171 | |
| 172 | If a machine fails to respond to the probe within the timeout period, the |
| 173 | output begins with the string |
| 174 | |
| 175 |     These servers unavailable due to network or server problems: |
| 176 | |
| 177 | and lists the hostname of each machine on its own line. The Cache Manager |
| 178 | stores machine records by Internet address, so the format of each hostname |
| 179 | (uppercase or lowercase letters, or an Internet address in dotted decimal |
| 180 | format) depends on how the local cell's name service translates it at the |
| 181 | time the command is issued. If a server machine is multihomed, the output |
| 182 | lists only one of its interfaces (usually, the currently most preferred |
| 183 | one). |
| 184 | |
| 185 | If the B<-interval> argument is provided with a value between C<1> and |
| 186 | C<600>, there is no output. If the value is C<0>, the output reports the |
| 187 | probe interval as follows: |
| 188 | |
| 189 |     The current down server probe interval is <interval> secs |
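
A script that consumes this output could parse the three message forms as sketched below. This is a minimal example built only on the strings shown in this section; the function name C<parse_checkservers_output> and the shape of the returned dictionary are hypothetical. It accepts hostnames separated by any whitespace, whether each is on its own line or several share a line, and drops a trailing period.

```python
import re

ALL_RUNNING = "All servers are running."
UNAVAILABLE_HEADER = "These servers unavailable due to network or server problems:"
INTERVAL_RE = re.compile(r"The current down server probe interval is (\d+) secs")


def parse_checkservers_output(text):
    """Parse fs checkservers output into a dict with 'interval' and 'unavailable'."""
    text = text.strip()
    match = INTERVAL_RE.search(text)
    if match:
        # Output of -interval 0: only the probe interval is reported.
        return {"interval": int(match.group(1)), "unavailable": None}
    if text.startswith(UNAVAILABLE_HEADER):
        rest = text[len(UNAVAILABLE_HEADER):]
        # Strip a trailing period and discard tokens that were only punctuation.
        hosts = [token.rstrip(".") for token in rest.split()]
        return {"interval": None, "unavailable": [h for h in hosts if h]}
    if text == ALL_RUNNING:
        return {"interval": None, "unavailable": []}
    if not text:
        # Setting the interval (1 through 600) produces no output.
        return {"interval": None, "unavailable": None}
    raise ValueError("unrecognized fs checkservers output")
```

An empty C<unavailable> list means only that no probed machine failed to respond, not that every server in the cell is running.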
| 190 | |
| 191 | =head1 EXAMPLES |
| 192 | |
| 193 | The following command displays the Cache Manager's current list of |
| 194 | unresponsive machines in the local cell, rather than probing them |
| 195 | again. The output indicates that if there were any machines marked |
| 196 | inaccessible, they all responded to the previous probe. |
| 197 | |
| 198 |     % fs checkservers -fast |
| 199 |     All servers are running. |
| 200 | |
| 201 | The following example probes machines in the Cache Manager's memory list |
| 202 | that belong to the C<example.org> cell: |
| 203 | |
| 204 |     % fs checkservers -cell example.org |
| 205 |     All servers are running. |
| 206 | |
| 207 | The following example probes all server machines in the Cache Manager's |
| 208 | memory list. It reports that two machines did not respond to the probe. |
| 209 | |
| 210 |     % fs checkservers -all |
| 211 |     These servers unavailable due to network or server problems: |
| 212 |     fs1.example.com SV3.EXAMPLE.ORG. |
| 213 | |
| 214 | =head1 PRIVILEGE REQUIRED |
| 215 | |
| 216 | To set the probe interval, the issuer must be logged in as the local |
| 217 | superuser C<root>. Otherwise, no privilege is required. |
| 218 | |
| 219 | =head1 SEE ALSO |
| 220 | |
| 221 | L<CellServDB(5)>, |
| 222 | L<ThisCell(5)>, |
| 223 | L<fs_newcell(1)> |
| 224 | |
| 225 | =head1 COPYRIGHT |
| 226 | |
| 227 | IBM Corporation 2000. <http://www.ibm.com/> All Rights Reserved. |
| 228 | |
| 229 | This documentation is covered by the IBM Public License Version 1.0. It was |
| 230 | converted from HTML to POD by software written by Chas Williams and Russ |
| 231 | Allbery, based on work by Alf Wachsmann and Elizabeth Cassell. |