Commit | Line | Data |
---|---|---|
805e021f CE |
1 | |
2 | Rx Debug | |
3 | -------- | |
4 | ||
5 | Introduction | |
6 | ============ | |
7 | ||
8 | Rx provides data collections for remote debugging and troubleshooting using UDP | |
9 | packets. This document provides details on the protocol, data formats, and | |
10 | the data format versions. | |
11 | ||
12 | ||
13 | Protocol | |
14 | ======== | |
15 | ||
16 | A simple request/response protocol is used to request this information | |
17 | from an Rx instance. Request and response packets contain an Rx header but | |
18 | only a subset of the header fields are used, since the debugging packets are | |
19 | not part of the Rx RPC protocol. | |
20 | ||
21 | The exchange works as follows. A client sends an Rx DEBUG (8) packet to the | |
22 | address:port of an active Rx instance. This request contains an arbitrary | |
23 | request number in the callNumber field of the Rx header (reused here since | |
24 | DEBUG packets are never used in RPCs). The payload of the request is a pair | |
25 | of 32-bit integers in network byte order. The first integer indicates | |
26 | which data collection type is requested. The second integer indicates which | |
27 | record number of that type is requested, for data types which have multiple | |
28 | records, such as the rx connections and rx peers. The request packet must have | |
29 | the CLIENT-INITIATED flag set in the Rx header. | |
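The request header described above could be filled in roughly as follows. This is a minimal sketch, not the OpenAFS implementation: the 28-byte header size, the byte offsets of callNumber (8), type (20), and flags (21), and the value 1 for the CLIENT-INITIATED flag are assumptions about the Rx wire header that this document does not define, and all rxdbg_* names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed Rx wire-header constants; not defined by this document. */
#define RXDBG_HEADER_SIZE      28  /* assumed size of the Rx header */
#define RXDBG_OFF_CALLNUMBER    8  /* assumed offset of callNumber */
#define RXDBG_OFF_TYPE         20  /* assumed offset of the packet type */
#define RXDBG_OFF_FLAGS        21  /* assumed offset of the flags byte */
#define RXDBG_TYPE_DEBUG        8  /* Rx DEBUG packet type */
#define RXDBG_CLIENT_INITIATED  1  /* assumed value of CLIENT-INITIATED */

/* Store a 32-bit value in network (big-endian) byte order. */
static void rxdbg_put32(unsigned char *p, uint32_t v)
{
    p[0] = (unsigned char)(v >> 24);
    p[1] = (unsigned char)(v >> 16);
    p[2] = (unsigned char)(v >> 8);
    p[3] = (unsigned char)v;
}

/*
 * Fill buf with the header of a DEBUG request. Header fields that the
 * debug protocol does not use are left as zero.
 * Returns the header length, or 0 if buf is missing or too small.
 */
size_t rxdbg_build_header(unsigned char *buf, size_t buflen,
                          uint32_t request_number)
{
    if (buf == NULL || buflen < RXDBG_HEADER_SIZE)
        return 0;
    memset(buf, 0, RXDBG_HEADER_SIZE);
    rxdbg_put32(buf + RXDBG_OFF_CALLNUMBER, request_number);
    buf[RXDBG_OFF_TYPE] = RXDBG_TYPE_DEBUG;
    buf[RXDBG_OFF_FLAGS] = RXDBG_CLIENT_INITIATED;
    return RXDBG_HEADER_SIZE;
}
```

The two-integer payload would follow immediately after the bytes written here.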
30 | ||
31 | Rx responds with a single Rx DEBUG (8) packet, the payload of which contains | |
32 | the data record for the type and index requested. The callNumber in the Rx | |
33 | header of the response equals the callNumber of the request, allowing the | |
34 | client to match responses to requests. The response DEBUG packet does not | |
35 | contain the request type and index parameters. | |
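Matching a response to its request by callNumber might look like the sketch below. As before, the header size and field offsets are assumptions about the Rx wire header, and rxdbg_match_response is a hypothetical helper, not an OpenAFS function.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed Rx wire-header constants; not defined by this document. */
#define RXDBG_HEADER_SIZE     28
#define RXDBG_OFF_CALLNUMBER   8
#define RXDBG_OFF_TYPE        20
#define RXDBG_TYPE_DEBUG       8

/* Read a 32-bit value stored in network (big-endian) byte order. */
static uint32_t rxdbg_get32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/*
 * Return a pointer to the payload if pkt is a DEBUG packet whose
 * callNumber equals request_number, otherwise NULL. On success the
 * payload length is stored in *payload_len.
 */
const unsigned char *rxdbg_match_response(const unsigned char *pkt, size_t len,
                                          uint32_t request_number,
                                          size_t *payload_len)
{
    if (pkt == NULL || payload_len == NULL || len < RXDBG_HEADER_SIZE)
        return NULL;
    if (pkt[RXDBG_OFF_TYPE] != RXDBG_TYPE_DEBUG)
        return NULL;
    if (rxdbg_get32(pkt + RXDBG_OFF_CALLNUMBER) != request_number)
        return NULL;
    *payload_len = len - RXDBG_HEADER_SIZE;
    return pkt + RXDBG_HEADER_SIZE;
}
```

Because the response does not echo the type and index, a client has to remember which request each request number belongs to.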
36 | ||
37 | The first 32 bits of the response payload, in network byte order, indicate | |
38 | error conditions: | |
39 | ||
40 | * 0xFFFFFFFF (-1) index is out of range | |
41 | * 0xFFFFFFF8 (-8) unknown request type | |
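A client could test for these two codes before decoding a record, along the lines of the following sketch. It follows the description above literally (only the first 32 bits are inspected). The enum names and the local "short payload" code are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

enum rxdbg_status {
    RXDBG_OK = 0,
    RXDBG_BAD_INDEX = -1,   /* 0xFFFFFFFF: index is out of range */
    RXDBG_BAD_TYPE = -8,    /* 0xFFFFFFF8: unknown request type */
    RXDBG_SHORT = -100      /* hypothetical local code: payload too short */
};

/* Classify a response payload by its first 32 bits (network byte order). */
int rxdbg_check_payload(const unsigned char *payload, size_t len)
{
    uint32_t first;

    if (payload == NULL || len < 4)
        return RXDBG_SHORT;
    first = ((uint32_t)payload[0] << 24) | ((uint32_t)payload[1] << 16) |
            ((uint32_t)payload[2] << 8) | (uint32_t)payload[3];
    if (first == 0xFFFFFFFFu)
        return RXDBG_BAD_INDEX;
    if (first == 0xFFFFFFF8u)
        return RXDBG_BAD_TYPE;
    return RXDBG_OK;
}
```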
42 | ||
43 | ||
44 | Data Collection Types | |
45 | ===================== | |
46 | ||
47 | OpenAFS defines 5 types of data collections which may be | |
48 | requested: | |
49 | ||
50 | 1 GETSTATS Basic Rx statistics (struct rx_debugStats) | |
51 | 2 GETCONN Active connections [indexed] (struct rx_debugConn) | |
52 | 3 GETALLCONN All connections [indexed] (struct rx_debugConn) | |
53 | 4 RXSTATS Detailed Rx statistics (struct rx_statistics) | |
54 | 5 GETPEER Rx peer info [indexed] (struct rx_debugPeer) | |
55 | ||
56 | The format of the response data for each type is given below. XDR is | |
57 | not used. All integers are in network byte order. | |
58 | ||
59 | In a typical exchange, a client will request the "basic Rx stats" data first. | |
60 | This contains a data layout version number (detailed in the next section). | |
61 | ||
62 | Types GETCONN (2), GETALLCONN (3), and GETPEER (5) are array-like data | |
63 | collections. The index field is used to retrieve each record, one per packet. | |
64 | The first record is index 0. The client requests each record in turn, starting | |
65 | at zero and incrementing the index by one per request packet, until the Rx | |
66 | service returns -1 (out of range). No provisions are made for locking the data | |
67 | collections between requests, as this is intended only to be a debugging | |
68 | interface. | |
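The record-by-record retrieval described above can be sketched as a simple loop. The transport is abstracted behind a hypothetical callback (rxdbg_fetch_fn) that sends one request and returns the response payload. rxdbg_fake_fetch is only a stand-in used to exercise the loop, and the max_records cap is an added safeguard, since the collection is not locked and may change between requests.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical transport callback: sends a request for (type, index),
 * copies the response payload into buf, and returns its length
 * (0 on failure or timeout).
 */
typedef size_t (*rxdbg_fetch_fn)(void *ctx, int32_t type, int32_t index,
                                 unsigned char *buf, size_t buflen);

/*
 * Request records 0, 1, 2, ... until the service answers -1 (out of
 * range) or max_records is reached. Returns the number of records
 * retrieved, or -1 if the transport failed.
 */
long rxdbg_walk(rxdbg_fetch_fn fetch, void *ctx, int32_t type,
                long max_records)
{
    unsigned char buf[1500];
    long n;

    for (n = 0; n < max_records; n++) {
        size_t len = fetch(ctx, type, (int32_t)n, buf, sizeof(buf));
        if (len < 4)
            return -1;
        if (buf[0] == 0xFF && buf[1] == 0xFF &&
            buf[2] == 0xFF && buf[3] == 0xFF)
            break;          /* -1: index is out of range, we are done */
        /* A real client would decode and print the record here. */
    }
    return n;
}

/* Stand-in transport: pretends the service holds *ctx records. */
size_t rxdbg_fake_fetch(void *ctx, int32_t type, int32_t index,
                        unsigned char *buf, size_t buflen)
{
    int32_t total = *(const int32_t *)ctx;

    (void)type;
    if (buflen < 4)
        return 0;
    memset(buf, index < total ? 0x00 : 0xFF, 4);
    return 4;
}
```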
69 | ||
70 | ||
71 | Data Collection Versions | |
72 | ======================== | |
73 | ||
74 | Every Rx service has a single-byte debugging version id, which is set at | |
75 | build time. This version id allows clients to properly interpret the response | |
76 | data formats for the various data types. The version id is present in the | |
77 | basic Rx statistics (type 1) response data. | |
78 | ||
79 | The first usable version is 'L', which was present in early Transarc/IBM AFS. | |
80 | The first version in OpenAFS was 'Q', and versions after 'Q' are OpenAFS | |
81 | specific extensions. The current version for OpenAFS is 'S'. | |
82 | ||
83 | Historically, the version id has been incremented when a debug data type is | |
84 | added or changed. The version history is summarized in the following list: | |
85 | ||
86 | 'L' - Earliest usable version | |
87 | - GETSTATS (1) supported | |
88 | - GETCONN (2) supported (with obsolete format rx_debugConn_vL) | |
89 | - Added connection object security stats (rx_securityObjectStats) to GETCONN (2) | |
90 | - Transarc/IBM AFS | |
91 | ||
92 | 'M' - Added GETALLCONN (3) data type | |
93 | - Added RXSTATS (4) data type | |
94 | - Transarc/IBM AFS | |
95 | ||
96 | 'N' - Added calls waiting for a thread count (nWaiting) to GETSTATS (1) | |
97 | - Transarc/IBM AFS | |
98 | ||
99 | 'O' - Added number of idle threads count (idleThreads) to GETSTATS (1) | |
100 | - Transarc/IBM AFS | |
101 | ||
102 | 'P' - Added cbuf packet allocation failure counts (receiveCbufPktAllocFailures | |
103 | and sendCbufPktAllocFailures) to RXSTATS (4) | |
104 | - Transarc/IBM AFS | |
105 | ||
106 | 'Q' - Added GETPEER (5) data type | |
107 | - Transarc/IBM AFS | |
108 | - OpenAFS 1.0 | |
109 | ||
110 | (?) - Added number of busy aborts sent (nBusies) to RXSTATS (4) | |
111 | - rxdebug was not changed to display this new count | |
112 | - OpenAFS 1.4.0 | |
113 | ||
114 | 'R' - Added total calls which waited for a thread (nWaited) to GETSTATS (1) | |
115 | - OpenAFS 1.5.0 (devel) | |
116 | - OpenAFS 1.6.0 (stable) | |
117 | ||
118 | 'S' - Added total packets allocated (nPackets) to GETSTATS (1) | |
119 | - OpenAFS 1.5.53 (devel) | |
120 | - OpenAFS 1.6.0 (stable) | |
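Based on the history above, a client could decide which data types to request from the version id alone. The helper below is a minimal sketch with a hypothetical name. It relies on the version ids being consecutive ASCII letters.

```c
/*
 * Return 1 if a service reporting debug version `version` is expected
 * to support data collection `type`, else 0.
 * GETSTATS (1) and GETCONN (2) exist since 'L', GETALLCONN (3) and
 * RXSTATS (4) since 'M', and GETPEER (5) since 'Q'.
 */
int rxdbg_type_supported(char version, int type)
{
    switch (type) {
    case 1:
    case 2:
        return version >= 'L';
    case 3:
    case 4:
        return version >= 'M';
    case 5:
        return version >= 'Q';
    default:
        return 0;
    }
}
```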
121 | ||
122 | ||
123 | ||
124 | Debug Request Parameters | |
125 | ======================== | |
126 | ||
127 | The payload of DEBUG request packets is two 32-bit integers | |
128 | in network byte order. | |
129 | ||
130 | ||
131 | struct rx_debugIn { | |
132 | afs_int32 type; /* requested type; range 1..5 */ | |
133 | afs_int32 index; /* record number: 0 .. n */ | |
134 | }; | |
135 | ||
136 | The index field should be set to 0 when type is GETSTATS (1) or RXSTATS (4). | |
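Encoding the request payload by hand, rather than by writing a struct to the wire, might look like this. The function name is hypothetical. Forcing the index to 0 for the non-indexed types follows the rule stated above.

```c
#include <stddef.h>
#include <stdint.h>

/* Store a 32-bit value in network (big-endian) byte order. */
static void rxdbg_put32(unsigned char *p, uint32_t v)
{
    p[0] = (unsigned char)(v >> 24);
    p[1] = (unsigned char)(v >> 16);
    p[2] = (unsigned char)(v >> 8);
    p[3] = (unsigned char)v;
}

/*
 * Encode the 8-byte request payload (type, then index).
 * Returns 8 on success, or 0 if the arguments are invalid.
 */
size_t rxdbg_encode_request(unsigned char *buf, size_t buflen,
                            int32_t type, int32_t index)
{
    if (buf == NULL || buflen < 8 || type < 1 || type > 5 || index < 0)
        return 0;
    if (type == 1 || type == 4)
        index = 0;      /* GETSTATS and RXSTATS are not indexed */
    rxdbg_put32(buf, (uint32_t)type);
    rxdbg_put32(buf + 4, (uint32_t)index);
    return 8;
}
```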
137 | ||
138 | ||
139 | ||
140 | GETSTATS (1) | |
141 | ============ | |
142 | ||
143 | GETSTATS returns basic Rx performance statistics and the overall debug | |
144 | version id. | |
145 | ||
146 | struct rx_debugStats { | |
147 | afs_int32 nFreePackets; | |
148 | afs_int32 packetReclaims; | |
149 | afs_int32 callsExecuted; | |
150 | char waitingForPackets; | |
151 | char usedFDs; | |
152 | char version; | |
153 | char spare1; | |
154 | afs_int32 nWaiting; /* Version 'N': number of calls waiting for a thread */ | |
155 | afs_int32 idleThreads; /* Version 'O': number of server threads that are idle */ | |
156 | afs_int32 nWaited; /* Version 'R': total calls waited */ | |
157 | afs_int32 nPackets; /* Version 'S': total packets allocated */ | |
158 | afs_int32 spare2[6]; | |
159 | }; | |
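The version-dependent fields can be decoded by byte offset, as in the sketch below. The offsets assume the structure is sent with no padding between members (three 32-bit integers, four single bytes, then 32-bit integers), which is how the layout above packs on common ABIs. The struct and function names are hypothetical, and fields the reported version does not provide are set to -1.

```c
#include <stddef.h>
#include <stdint.h>

struct rxdbg_stats {
    int32_t nFreePackets, packetReclaims, callsExecuted;
    unsigned char waitingForPackets, usedFDs;
    char version;
    int32_t nWaiting, idleThreads, nWaited, nPackets; /* -1 if absent */
};

/* Read a 32-bit value stored in network (big-endian) byte order. */
static int32_t rxdbg_get32(const unsigned char *p)
{
    return (int32_t)(((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
                     ((uint32_t)p[2] << 8) | (uint32_t)p[3]);
}

/* Decode one optional field, present only from version `since` onward. */
static int32_t rxdbg_opt32(const unsigned char *p, size_t len, size_t off,
                           char version, char since)
{
    return (version >= since && len >= off + 4) ? rxdbg_get32(p + off) : -1;
}

/* Decode a GETSTATS payload. Returns 0 on success, -1 if it is too short. */
int rxdbg_decode_stats(const unsigned char *p, size_t len,
                       struct rxdbg_stats *out)
{
    if (p == NULL || out == NULL || len < 16)
        return -1;
    out->nFreePackets = rxdbg_get32(p);
    out->packetReclaims = rxdbg_get32(p + 4);
    out->callsExecuted = rxdbg_get32(p + 8);
    out->waitingForPackets = p[12];
    out->usedFDs = p[13];
    out->version = (char)p[14];
    out->nWaiting = rxdbg_opt32(p, len, 16, out->version, 'N');
    out->idleThreads = rxdbg_opt32(p, len, 20, out->version, 'O');
    out->nWaited = rxdbg_opt32(p, len, 24, out->version, 'R');
    out->nPackets = rxdbg_opt32(p, len, 28, out->version, 'S');
    return 0;
}
```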
160 | ||
161 | ||
162 | GETCONN (2) and GETALLCONN (3) | |
163 | ============================== | |
164 | ||
165 | GETCONN (2) returns the information record of the active connection at the | |
166 | given index. | |
167 | ||
168 | GETALLCONN (3) returns a connection information record, active or not, | |
169 | for the given index. The GETALLCONN (3) data type was added in | |
170 | version 'M'. | |
171 | ||
172 | The data format is the same for GETCONN (2) and GETALLCONN (3), and is | |
173 | as follows: | |
174 | ||
175 | struct rx_debugConn { | |
176 | afs_uint32 host; | |
177 | afs_int32 cid; | |
178 | afs_int32 serial; | |
179 | afs_int32 callNumber[RX_MAXCALLS]; | |
180 | afs_int32 error; | |
181 | short port; | |
182 | char flags; | |
183 | char type; | |
184 | char securityIndex; | |
185 | char sparec[3]; /* force correct alignment */ | |
186 | char callState[RX_MAXCALLS]; | |
187 | char callMode[RX_MAXCALLS]; | |
188 | char callFlags[RX_MAXCALLS]; | |
189 | char callOther[RX_MAXCALLS]; | |
190 | /* old style getconn stops here */ | |
191 | struct rx_securityObjectStats secStats; | |
192 | afs_int32 epoch; | |
193 | afs_int32 natMTU; | |
194 | afs_int32 sparel[9]; | |
195 | }; | |
196 | ||
197 | ||
198 | An obsolete layout, which exhibited a problem with data alignment, was used in | |
199 | Version 'L'. This is defined as: | |
200 | ||
201 | struct rx_debugConn_vL { | |
202 | afs_uint32 host; | |
203 | afs_int32 cid; | |
204 | afs_int32 serial; | |
205 | afs_int32 callNumber[RX_MAXCALLS]; | |
206 | afs_int32 error; | |
207 | short port; | |
208 | char flags; | |
209 | char type; | |
210 | char securityIndex; | |
211 | char callState[RX_MAXCALLS]; | |
212 | char callMode[RX_MAXCALLS]; | |
213 | char callFlags[RX_MAXCALLS]; | |
214 | char callOther[RX_MAXCALLS]; | |
215 | /* old style getconn stops here */ | |
216 | struct rx_securityObjectStats secStats; | |
217 | afs_int32 sparel[10]; | |
218 | }; | |
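One plausible way the alignment problem shows up is compiler-inserted padding in front of secStats in the old layout. The sketch below measures that padding with offsetof on local copies of the two layouts. RX_MAXCALLS is assumed to be 4, the fixed-width types stand in for afs_int32 and afs_uint32, short is assumed to be 2 bytes, and the struct and function names are hypothetical. The result depends on the ABI. On common platforms with 4-byte alignment for 32-bit integers, the old layout needs 3 hidden bytes and the current layout needs none.

```c
#include <stddef.h>
#include <stdint.h>

#define RX_MAXCALLS 4   /* assumed value; not stated in this document */

struct sec_stats {
    char type, level, sparec[10];
    int32_t flags;
    uint32_t expires, packetsReceived, packetsSent, bytesReceived, bytesSent;
    short spares[4];
    int32_t sparel[8];
};

struct conn_vL {        /* obsolete layout: no explicit pad bytes */
    uint32_t host;
    int32_t cid, serial, callNumber[RX_MAXCALLS], error;
    short port;
    char flags, type, securityIndex;
    char callState[RX_MAXCALLS], callMode[RX_MAXCALLS];
    char callFlags[RX_MAXCALLS], callOther[RX_MAXCALLS];
    struct sec_stats secStats;
    int32_t sparel[10];
};

struct conn_cur {       /* current layout: sparec[3] makes padding explicit */
    uint32_t host;
    int32_t cid, serial, callNumber[RX_MAXCALLS], error;
    short port;
    char flags, type, securityIndex, sparec[3];
    char callState[RX_MAXCALLS], callMode[RX_MAXCALLS];
    char callFlags[RX_MAXCALLS], callOther[RX_MAXCALLS];
    struct sec_stats secStats;
    int32_t epoch, natMTU, sparel[9];
};

/* Bytes from host through securityIndex when packed end to end (37). */
#define PREFIX_BYTES (4 + 4 + 4 + 4 * RX_MAXCALLS + 4 + 2 + 1 + 1 + 1)
/* Bytes used by the four per-call char arrays (16). */
#define CALL_ARRAY_BYTES (4 * RX_MAXCALLS)

/* Hidden padding the compiler inserts before secStats in each layout. */
size_t rxdbg_hidden_padding_vL(void)
{
    return offsetof(struct conn_vL, secStats) -
           (PREFIX_BYTES + CALL_ARRAY_BYTES);
}

size_t rxdbg_hidden_padding_current(void)
{
    return offsetof(struct conn_cur, secStats) -
           (PREFIX_BYTES + 3 + CALL_ARRAY_BYTES);
}
```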
219 | ||
220 | ||
221 | The layout of the secStats field is as follows: | |
222 | ||
223 | struct rx_securityObjectStats { | |
224 | char type; /* 0:unk 1:null,2:vab 3:kad */ | |
225 | char level; | |
226 | char sparec[10]; /* force correct alignment */ | |
227 | afs_int32 flags; /* 1=>unalloc, 2=>auth, 4=>expired */ | |
228 | afs_uint32 expires; | |
229 | afs_uint32 packetsReceived; | |
230 | afs_uint32 packetsSent; | |
231 | afs_uint32 bytesReceived; | |
232 | afs_uint32 bytesSent; | |
233 | short spares[4]; | |
234 | afs_int32 sparel[8]; | |
235 | }; | |
236 | ||
237 | ||
238 | ||
239 | RXSTATS (4) | |
240 | =========== | |
241 | ||
242 | RXSTATS (4) returns general Rx statistics. Every member of the returned | |
243 | structure is a 32-bit integer in network byte order. It is assumed that | |
244 | sizeof(int) is equal to sizeof(afs_int32). | |
245 | ||
246 | The RXSTATS (4) data type was added in Version 'M'. | |
247 | ||
248 | ||
249 | struct rx_statistics { /* General rx statistics */ | |
250 | int packetRequests; /* Number of packet allocation requests */ | |
251 | int receivePktAllocFailures; | |
252 | int sendPktAllocFailures; | |
253 | int specialPktAllocFailures; | |
254 | int socketGreedy; /* Whether SO_GREEDY succeeded */ | |
255 | int bogusPacketOnRead; /* Number of inappropriately short packets received */ | |
256 | int bogusHost; /* Host address from bogus packets */ | |
257 | int noPacketOnRead; /* Number of read packets attempted when there was actually no packet to read off the wire */ | |
258 | int noPacketBuffersOnRead; /* Number of dropped data packets due to lack of packet buffers */ | |
259 | int selects; /* Number of selects waiting for packet or timeout */ | |
260 | int sendSelects; /* Number of selects forced when sending packet */ | |
261 | int packetsRead[RX_N_PACKET_TYPES]; /* Total number of packets read, per type */ | |
262 | int dataPacketsRead; /* Number of unique data packets read off the wire */ | |
263 | int ackPacketsRead; /* Number of ack packets read */ | |
264 | int dupPacketsRead; /* Number of duplicate data packets read */ | |
265 | int spuriousPacketsRead; /* Number of inappropriate data packets */ | |
266 | int packetsSent[RX_N_PACKET_TYPES]; /* Number of rxi_Sends: packets sent over the wire, per type */ | |
267 | int ackPacketsSent; /* Number of acks sent */ | |
268 | int pingPacketsSent; /* Total number of ping packets sent */ | |
269 | int abortPacketsSent; /* Total number of aborts */ | |
270 | int busyPacketsSent; /* Total number of busies sent received */ | |
271 | int dataPacketsSent; /* Number of unique data packets sent */ | |
272 | int dataPacketsReSent; /* Number of retransmissions */ | |
273 | int dataPacketsPushed; /* Number of retransmissions pushed early by a NACK */ | |
274 | int ignoreAckedPacket; /* Number of packets with acked flag, on rxi_Start */ | |
275 | struct clock totalRtt; /* Total round trip time measured (use to compute average) */ | |
276 | struct clock minRtt; /* Minimum round trip time measured */ | |
277 | struct clock maxRtt; /* Maximum round trip time measured */ | |
278 | int nRttSamples; /* Total number of round trip samples */ | |
279 | int nServerConns; /* Total number of server connections */ | |
280 | int nClientConns; /* Total number of client connections */ | |
281 | int nPeerStructs; /* Total number of peer structures */ | |
282 | int nCallStructs; /* Total number of call structures allocated */ | |
283 | int nFreeCallStructs; /* Total number of previously allocated free call structures */ | |
284 | int netSendFailures; | |
285 | afs_int32 fatalErrors; | |
286 | int ignorePacketDally; /* packets dropped because call is in dally state */ | |
287 | int receiveCbufPktAllocFailures; /* Version 'P': receive cbuf packet alloc failures */ | |
288 | int sendCbufPktAllocFailures; /* Version 'P': send cbuf packet alloc failures */ | |
289 | int nBusies; /* Version 'R': number of busy aborts sent */ | |
290 | int spares[4]; | |
291 | }; | |
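Since every member is a 32-bit integer in network byte order, an RXSTATS payload can be converted to host order as a flat array of words, as in this sketch. The function name is hypothetical. Mapping array positions to member names is left to the caller, since it depends on the value of RX_N_PACKET_TYPES and on the debug version.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Convert a payload of big-endian 32-bit words into host-order values.
 * At most max_words are written to out; trailing bytes that do not form
 * a whole word are ignored. Returns the number of words decoded.
 */
size_t rxdbg_decode_words(const unsigned char *payload, size_t len,
                          int32_t *out, size_t max_words)
{
    size_t n, i;

    if (payload == NULL || out == NULL)
        return 0;
    n = len / 4;
    if (n > max_words)
        n = max_words;
    for (i = 0; i < n; i++) {
        const unsigned char *p = payload + 4 * i;
        out[i] = (int32_t)(((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
                           ((uint32_t)p[2] << 8) | (uint32_t)p[3]);
    }
    return n;
}
```

With this layout, word 0 corresponds to packetRequests, word 1 to receivePktAllocFailures, and so on in declaration order.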
292 | ||
293 | ||
294 | GETPEER (5) | |
295 | =========== | |
296 | ||
297 | GETPEER (5) returns a peer information record, for the given index. | |
298 | ||
299 | struct rx_debugPeer { | |
300 | afs_uint32 host; | |
301 | u_short port; | |
302 | u_short ifMTU; | |
303 | afs_uint32 idleWhen; | |
304 | short refCount; | |
305 | u_char burstSize; | |
306 | u_char burst; | |
307 | struct clock burstWait; | |
308 | afs_int32 rtt; | |
309 | afs_int32 rtt_dev; | |
310 | struct clock timeout; | |
311 | afs_int32 nSent; | |
312 | afs_int32 reSends; | |
313 | afs_int32 inPacketSkew; | |
314 | afs_int32 outPacketSkew; | |
315 | afs_int32 rateFlag; | |
316 | u_short natMTU; | |
317 | u_short maxMTU; | |
318 | u_short maxDgramPackets; | |
319 | u_short ifDgramPackets; | |
320 | u_short MTU; | |
321 | u_short cwind; | |
322 | u_short nDgramPackets; | |
323 | u_short congestSeq; | |
324 | afs_hyper_t bytesSent; | |
325 | afs_hyper_t bytesReceived; | |
326 | afs_int32 sparel[10]; | |
327 | }; | |
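As a small illustration, the leading members of a GETPEER record (host, port, ifMTU) can be decoded by offset. This is only a sketch: it assumes these three members are sent without padding at offsets 0, 4, and 6, and it deliberately stops before the later members, whose offsets depend on how struct clock and afs_hyper_t are laid out. The names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

struct rxdbg_peer_prefix {
    unsigned char host[4];  /* IPv4 address bytes, most significant first */
    unsigned int port;      /* UDP port, host byte order */
    unsigned int ifMTU;     /* interface MTU, host byte order */
};

/* Read a 16-bit value stored in network (big-endian) byte order. */
static unsigned int rxdbg_get16(const unsigned char *p)
{
    return ((unsigned int)p[0] << 8) | (unsigned int)p[1];
}

/* Decode the first 8 bytes of a GETPEER payload. Returns 0 or -1. */
int rxdbg_decode_peer_prefix(const unsigned char *p, size_t len,
                             struct rxdbg_peer_prefix *out)
{
    size_t i;

    if (p == NULL || out == NULL || len < 8)
        return -1;
    for (i = 0; i < 4; i++)
        out->host[i] = p[i];
    out->port = rxdbg_get16(p + 4);
    out->ifMTU = rxdbg_get16(p + 6);
    return 0;
}
```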
328 | ||
329 |