                                Rx Debug
                                --------

Introduction
============

Rx provides data collections for remote debugging and troubleshooting using UDP
packets. This document provides details on the protocol, data formats, and
the data format versions.


Protocol
========

A simple request/response protocol is used to request this information
from an Rx instance. Request and response packets contain an Rx header, but
only a subset of the header fields is used, since the debugging packets are
not part of the Rx RPC protocol.

A client sends an Rx DEBUG (8) packet to the address:port of an active Rx
instance. The request carries an arbitrary request number in the callNumber
field of the Rx header (reused here since DEBUG packets are never used in
RPCs). The payload of the request is simply a pair of 32-bit integers in
network byte order. The first integer indicates which data collection type is
requested. The second integer indicates which record number of that type is
requested, for data types which have multiple records, such as the Rx
connections and Rx peers. The request packet must have the CLIENT-INITIATED
flag set in the Rx header.
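
The request payload described above can be sketched in C as follows. This is
a minimal illustration, not the OpenAFS implementation: the names put_be32,
encode_debug_request, and RX_DEBUG_REQUEST_LEN are hypothetical, and the Rx
header itself (packet type, callNumber, CLIENT-INITIATED flag) is not built
here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define RX_DEBUG_REQUEST_LEN 8  /* two 32-bit integers (hypothetical name) */

/* Store a 32-bit value at p in network (big-endian) byte order. */
static void put_be32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v >> 24);
    p[1] = (uint8_t)(v >> 16);
    p[2] = (uint8_t)(v >> 8);
    p[3] = (uint8_t)v;
}

/*
 * Fill out[0..7] with the DEBUG request payload: the data collection
 * type first, then the record index. Only the payload is produced; the
 * Rx header must be prepended by the caller.
 */
static void encode_debug_request(uint8_t out[RX_DEBUG_REQUEST_LEN],
                                 int32_t type, int32_t index)
{
    assert(out != NULL);
    put_be32(out, (uint32_t)type);
    put_be32(out + 4, (uint32_t)index);
}
```

Writing the bytes individually avoids any dependence on the byte order of the
host and on platform-specific socket headers.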

Rx responds with a single Rx DEBUG (8) packet, the payload of which contains
the data record for the type and index requested. The callNumber in the Rx
header of the response is the same as the callNumber of the request, allowing
the client to match responses to requests. The response DEBUG packet does not
contain the request type and index parameters.

The first 32 bits of the response payload, in network byte order, indicate
error conditions:

* 0xFFFFFFFF (-1) index is out of range
* 0xFFFFFFF8 (-8) unknown request type
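
A client might check for these two values before decoding a record. The sketch
below is only an illustration; the enum and function names are hypothetical.
It assumes the payload has already been separated from the Rx header, and it
does not try to distinguish an error code from a valid record whose first
field happens to hold the same bit pattern.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum rx_debug_status {          /* hypothetical names */
    RX_DEBUG_OK = 0,
    RX_DEBUG_OUT_OF_RANGE,      /* 0xFFFFFFFF (-1) */
    RX_DEBUG_UNKNOWN_TYPE,      /* 0xFFFFFFF8 (-8) */
    RX_DEBUG_TOO_SHORT          /* fewer than 4 payload bytes */
};

/* Read a 32-bit value stored in network (big-endian) byte order. */
static uint32_t get_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Classify a response payload by its first 32 bits. */
static enum rx_debug_status check_debug_response(const uint8_t *payload,
                                                 size_t len)
{
    uint32_t first;

    assert(payload != NULL);
    if (len < 4)
        return RX_DEBUG_TOO_SHORT;
    first = get_be32(payload);
    if (first == 0xFFFFFFFFu)
        return RX_DEBUG_OUT_OF_RANGE;
    if (first == 0xFFFFFFF8u)
        return RX_DEBUG_UNKNOWN_TYPE;
    return RX_DEBUG_OK;
}
```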


Data Collection Types
=====================

OpenAFS defines 5 types of data collections which may be
requested:

    1  GETSTATS    Basic Rx statistics             (struct rx_debugStats)
    2  GETCONN     Active connections [indexed]    (struct rx_debugConn)
    3  GETALLCONN  All connections [indexed]       (struct rx_debugConn)
    4  RXSTATS     Detailed Rx statistics          (struct rx_statistics)
    5  GETPEER     Rx peer info [indexed]          (struct rx_debugPeer)

The format of the response data for each type is given below. XDR is
not used. All integers are in network byte order.

In a typical exchange, a client will request the "basic Rx stats" data first.
This contains a data layout version number (detailed in the next section).

Types GETCONN (2), GETALLCONN (3), and GETPEER (5) are array-like data
collections. The index field is used to retrieve each record, one per packet.
The first record is index 0. The client may request each record in turn,
starting with index 0 and incrementing the index by one on each request
packet, until the Rx service returns -1 (out of range). No provisions are made
for locking the data collections between requests, as this is intended only to
be a debugging interface.
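
The indexed retrieval loop can be sketched as follows. Everything here is
hypothetical scaffolding: rx_debug_fetch_fn stands for whatever code sends one
DEBUG request and returns the response payload, fake_fetch is a stand-in
transport used only to exercise the loop, and max_records is a safety bound
added for the sketch because the data may change between requests.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical transport callback: sends one request for (type, index),
 * copies the response payload into buf, and returns its length, or a
 * negative value on failure.
 */
typedef int (*rx_debug_fetch_fn)(void *ctx, int32_t type, int32_t index,
                                 uint8_t *buf, size_t buflen);

static uint32_t get_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/*
 * Request records 0, 1, 2, ... until the service answers -1 (out of
 * range). Returns the number of records seen, or -1 on a transport
 * failure or an "unknown request type" answer.
 */
static int32_t count_records(rx_debug_fetch_fn fetch, void *ctx,
                             int32_t type, int32_t max_records)
{
    uint8_t buf[1500];
    int32_t index;

    assert(fetch != NULL);
    for (index = 0; index < max_records; index++) {
        int len = fetch(ctx, type, index, buf, sizeof(buf));
        uint32_t first;

        if (len < 4)
            return -1;
        first = get_be32(buf);
        if (first == 0xFFFFFFFFu)   /* -1: no more records */
            break;
        if (first == 0xFFFFFFF8u)   /* -8: type not supported */
            return -1;
        /* A real client would decode the record in buf here. */
    }
    return index;
}

/* Stand-in transport: pretends the service holds *(int *)ctx records. */
static int fake_fetch(void *ctx, int32_t type, int32_t index,
                      uint8_t *buf, size_t buflen)
{
    int nrecords = *(const int *)ctx;

    (void)type;
    if (buflen < 4)
        return -1;
    if (index >= nrecords) {
        buf[0] = buf[1] = buf[2] = buf[3] = 0xFF;
    } else {
        buf[0] = buf[1] = buf[2] = 0;
        buf[3] = (uint8_t)index;
    }
    return 4;
}
```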


Data Collection Versions
========================

Every Rx service has a single-byte debugging version id, which is set at
build time. This version id allows clients to properly interpret the response
data formats for the various data types. The version id is present in the
basic Rx statistics (type 1) response data.

The first usable version is 'L', which was present in early Transarc/IBM AFS.
The first version in OpenAFS was 'Q', and versions after 'Q' are
OpenAFS-specific extensions. The current version for OpenAFS is 'S'.

Historically, the version id has been incremented when a debug data type is
added or changed. The version history is summarized in the following table:

    'L' - Earliest usable version
        - GETSTATS (1) supported
        - GETCONN (2) supported (with obsolete format rx_debugConn_vL)
        - Added connection object security stats (rx_securityObjectStats) to GETCONN (2)
        - Transarc/IBM AFS

    'M' - Added GETALLCONN (3) data type
        - Added RXSTATS (4) data type
        - Transarc/IBM AFS

    'N' - Added calls waiting for a thread count (nWaiting) to GETSTATS (1)
        - Transarc/IBM AFS

    'O' - Added number of idle threads count (idleThreads) to GETSTATS (1)
        - Transarc/IBM AFS

    'P' - Added cbuf packet allocation failure counts (receiveCbufPktAllocFailures
          and sendCbufPktAllocFailures) to RXSTATS (4)
        - Transarc/IBM AFS

    'Q' - Added GETPEER (5) data type
        - Transarc/IBM AFS
        - OpenAFS 1.0

    (?) - Added number of busy aborts sent (nBusies) to RXSTATS (4)
        - rxdebug was not changed to display this new count
        - OpenAFS 1.4.0

    'R' - Added total calls which waited for a thread (nWaited) to GETSTATS (1)
        - OpenAFS 1.5.0 (devel)
        - OpenAFS 1.6.0 (stable)

    'S' - Added total packets allocated (nPackets) to GETSTATS (1)
        - OpenAFS 1.5.53 (devel)
        - OpenAFS 1.6.0 (stable)
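
A client can use the version id to decide which request types are worth
sending. The sketch below encodes only the data type rows of the table above;
the enum constants and the function name are hypothetical, and the comparison
relies on the version letters being consecutive in the character set, as they
are in ASCII.

```c
#include <stdbool.h>

/* Request type codes from the Data Collection Types table. */
enum {
    GETSTATS = 1,
    GETCONN = 2,
    GETALLCONN = 3,
    RXSTATS = 4,
    GETPEER = 5
};

/* Does a service reporting this version id support the request type? */
static bool rx_debug_type_supported(char version, int type)
{
    switch (type) {
    case GETSTATS:
    case GETCONN:
        return version >= 'L';
    case GETALLCONN:
    case RXSTATS:
        return version >= 'M';
    case GETPEER:
        return version >= 'Q';
    default:
        return false;
    }
}
```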



Debug Request Parameters
========================

The payload of a DEBUG request packet consists of two 32-bit integers
in network byte order.


    struct rx_debugIn {
        afs_int32 type;     /* requested type; range 1..5 */
        afs_int32 index;    /* record number: 0 .. n */
    };

The index field should be set to 0 when type is GETSTATS (1) or RXSTATS (4).



GETSTATS (1)
============

GETSTATS returns basic Rx performance statistics and the overall debug
version id.

    struct rx_debugStats {
        afs_int32 nFreePackets;
        afs_int32 packetReclaims;
        afs_int32 callsExecuted;
        char waitingForPackets;
        char usedFDs;
        char version;
        char spare1;
        afs_int32 nWaiting;       /* Version 'N': number of calls waiting for a thread */
        afs_int32 idleThreads;    /* Version 'O': number of server threads that are idle */
        afs_int32 nWaited;        /* Version 'R': total calls waited */
        afs_int32 nPackets;       /* Version 'S': total packets allocated */
        afs_int32 spare2[6];
    };
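
A version-aware decoder for this record might look like the sketch below. The
byte offsets are derived from the structure above under the assumption that
the members are packed with no padding (they are naturally aligned, so this
holds on common ABIs). The struct debug_stats type and the function names are
hypothetical, and fields newer than the reported version are simply set to
zero.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct debug_stats {            /* host-order copy; hypothetical name */
    int32_t nFreePackets;
    int32_t packetReclaims;
    int32_t callsExecuted;
    char waitingForPackets;
    char usedFDs;
    char version;
    int32_t nWaiting;           /* valid from version 'N' */
    int32_t idleThreads;        /* valid from version 'O' */
    int32_t nWaited;            /* valid from version 'R' */
    int32_t nPackets;           /* valid from version 'S' */
};

static int32_t get_be32(const uint8_t *p)
{
    return (int32_t)(((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
                     ((uint32_t)p[2] << 8) | (uint32_t)p[3]);
}

/* Read the field at off if the version and payload length allow it. */
static int32_t opt_field(const uint8_t *p, size_t len, size_t off,
                         char version, char since)
{
    return (version >= since && len >= off + 4) ? get_be32(p + off) : 0;
}

/* Returns 0 on success, -1 if the payload is too short. */
static int decode_debug_stats(const uint8_t *p, size_t len,
                              struct debug_stats *out)
{
    assert(p != NULL && out != NULL);
    if (len < 16)
        return -1;
    out->nFreePackets = get_be32(p);
    out->packetReclaims = get_be32(p + 4);
    out->callsExecuted = get_be32(p + 8);
    out->waitingForPackets = (char)p[12];
    out->usedFDs = (char)p[13];
    out->version = (char)p[14];     /* p[15] is spare1 */
    out->nWaiting = opt_field(p, len, 16, out->version, 'N');
    out->idleThreads = opt_field(p, len, 20, out->version, 'O');
    out->nWaited = opt_field(p, len, 24, out->version, 'R');
    out->nPackets = opt_field(p, len, 28, out->version, 'S');
    return 0;
}
```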


GETCONN (2) and GETALLCONN (3)
==============================

GETCONN (2) returns an active connection information record for the
given index.

GETALLCONN (3) returns a connection information record, active or not,
for the given index. The GETALLCONN (3) data type was added in
version 'M'.

The data format is the same for GETCONN (2) and GETALLCONN (3), and is
as follows:

    struct rx_debugConn {
        afs_uint32 host;
        afs_int32 cid;
        afs_int32 serial;
        afs_int32 callNumber[RX_MAXCALLS];
        afs_int32 error;
        short port;
        char flags;
        char type;
        char securityIndex;
        char sparec[3];     /* force correct alignment */
        char callState[RX_MAXCALLS];
        char callMode[RX_MAXCALLS];
        char callFlags[RX_MAXCALLS];
        char callOther[RX_MAXCALLS];
        /* old style getconn stops here */
        struct rx_securityObjectStats secStats;
        afs_int32 epoch;
        afs_int32 natMTU;
        afs_int32 sparel[9];
    };


An obsolete layout, which exhibited a problem with data alignment, was used in
version 'L'. This is defined as:

    struct rx_debugConn_vL {
        afs_uint32 host;
        afs_int32 cid;
        afs_int32 serial;
        afs_int32 callNumber[RX_MAXCALLS];
        afs_int32 error;
        short port;
        char flags;
        char type;
        char securityIndex;
        char callState[RX_MAXCALLS];
        char callMode[RX_MAXCALLS];
        char callFlags[RX_MAXCALLS];
        char callOther[RX_MAXCALLS];
        /* old style getconn stops here */
        struct rx_securityObjectStats secStats;
        afs_int32 sparel[10];
    };
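
The effect of the sparec[3] padding on the layout can be checked with
offsetof. The sketch below reproduces only the leading members of the two
structures, up to the "old style getconn stops here" marker. The struct and
function names are hypothetical, RX_MAXCALLS is assumed to be 4, and the
offsets noted in the comments are those produced by common ABIs with a 2-byte
short; other compilers may differ.

```c
#include <stddef.h>
#include <stdint.h>

#define RX_MAXCALLS 4   /* assumed value for this sketch */

struct conn_head_vL {   /* leading members of rx_debugConn_vL */
    uint32_t host;
    int32_t cid;
    int32_t serial;
    int32_t callNumber[RX_MAXCALLS];
    int32_t error;
    short port;
    char flags;
    char type;
    char securityIndex;
    char callState[RX_MAXCALLS];
    char callMode[RX_MAXCALLS];
    char callFlags[RX_MAXCALLS];
    char callOther[RX_MAXCALLS];
};

struct conn_head {      /* leading members of rx_debugConn */
    uint32_t host;
    int32_t cid;
    int32_t serial;
    int32_t callNumber[RX_MAXCALLS];
    int32_t error;
    short port;
    char flags;
    char type;
    char securityIndex;
    char sparec[3];     /* pads the char arrays to a 4-byte boundary */
    char callState[RX_MAXCALLS];
    char callMode[RX_MAXCALLS];
    char callFlags[RX_MAXCALLS];
    char callOther[RX_MAXCALLS];
};

/* Offset of callState in the obsolete layout (37 on common ABIs). */
static size_t call_state_offset_vL(void)
{
    return offsetof(struct conn_head_vL, callState);
}

/* Offset of callState in the current layout (40 on common ABIs). */
static size_t call_state_offset(void)
{
    return offsetof(struct conn_head, callState);
}
```

In the obsolete layout the per-call arrays end on an odd offset, so the
position of the following secStats member depends on padding inserted by the
compiler. In the current layout every member falls on its natural alignment
without implicit padding.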


The layout of the secStats field is as follows:

    struct rx_securityObjectStats {
        char type;          /* 0:unk, 1:null, 2:vab, 3:kad */
        char level;
        char sparec[10];    /* force correct alignment */
        afs_int32 flags;    /* 1=>unalloc, 2=>auth, 4=>expired */
        afs_uint32 expires;
        afs_uint32 packetsReceived;
        afs_uint32 packetsSent;
        afs_uint32 bytesReceived;
        afs_uint32 bytesSent;
        short spares[4];
        afs_int32 sparel[8];
    };
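
The type and flags values noted in the comments above can be interpreted as
in the following sketch. The constant and function names are hypothetical, the
short type names are taken directly from the structure comment, and mapping
every unlisted type value to "unk" is an assumption of the sketch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit values of rx_securityObjectStats.flags (hypothetical names). */
enum {
    SEC_FLAG_UNALLOC = 1,
    SEC_FLAG_AUTH = 2,
    SEC_FLAG_EXPIRED = 4
};

/* Map the security type code to the short name used in the comment. */
static const char *sec_type_name(int type)
{
    switch (type) {
    case 1:
        return "null";
    case 2:
        return "vab";
    case 3:
        return "kad";
    default:
        return "unk";   /* 0, and any value not listed above */
    }
}

/* Test one flag bit in a flags word already converted to host order. */
static bool sec_flag_set(int32_t flags, int32_t bit)
{
    return (flags & bit) != 0;
}
```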



RXSTATS (4)
===========

RXSTATS (4) returns general Rx statistics. Every member of the returned
structure is a 32-bit integer in network byte order. This assumes that
sizeof(int) is equal to sizeof(afs_int32).

The RXSTATS (4) data type was added in version 'M'.


    struct rx_statistics {  /* General rx statistics */
        int packetRequests;             /* Number of packet allocation requests */
        int receivePktAllocFailures;
        int sendPktAllocFailures;
        int specialPktAllocFailures;
        int socketGreedy;               /* Whether SO_GREEDY succeeded */
        int bogusPacketOnRead;          /* Number of inappropriately short packets received */
        int bogusHost;                  /* Host address from bogus packets */
        int noPacketOnRead;             /* Number of read packets attempted when there was actually no packet to read off the wire */
        int noPacketBuffersOnRead;      /* Number of dropped data packets due to lack of packet buffers */
        int selects;                    /* Number of selects waiting for packet or timeout */
        int sendSelects;                /* Number of selects forced when sending packet */
        int packetsRead[RX_N_PACKET_TYPES];     /* Total number of packets read, per type */
        int dataPacketsRead;            /* Number of unique data packets read off the wire */
        int ackPacketsRead;             /* Number of ack packets read */
        int dupPacketsRead;             /* Number of duplicate data packets read */
        int spuriousPacketsRead;        /* Number of inappropriate data packets */
        int packetsSent[RX_N_PACKET_TYPES];     /* Number of rxi_Sends: packets sent over the wire, per type */
        int ackPacketsSent;             /* Number of acks sent */
        int pingPacketsSent;            /* Total number of ping packets sent */
        int abortPacketsSent;           /* Total number of aborts */
        int busyPacketsSent;            /* Total number of busy packets sent */
        int dataPacketsSent;            /* Number of unique data packets sent */
        int dataPacketsReSent;          /* Number of retransmissions */
        int dataPacketsPushed;          /* Number of retransmissions pushed early by a NACK */
        int ignoreAckedPacket;          /* Number of packets with acked flag, on rxi_Start */
        struct clock totalRtt;          /* Total round trip time measured (use to compute average) */
        struct clock minRtt;            /* Minimum round trip time measured */
        struct clock maxRtt;            /* Maximum round trip time measured */
        int nRttSamples;                /* Total number of round trip samples */
        int nServerConns;               /* Total number of server connections */
        int nClientConns;               /* Total number of client connections */
        int nPeerStructs;               /* Total number of peer structures */
        int nCallStructs;               /* Total number of call structures allocated */
        int nFreeCallStructs;           /* Total number of previously allocated free call structures */
        int netSendFailures;
        afs_int32 fatalErrors;
        int ignorePacketDally;          /* packets dropped because call is in dally state */
        int receiveCbufPktAllocFailures;        /* Version 'P': receive cbuf packet alloc failures */
        int sendCbufPktAllocFailures;   /* Version 'P': send cbuf packet alloc failures */
        int nBusies;                    /* Version 'R': number of busy aborts sent */
        int spares[4];
    };
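
Because every member is a 32-bit integer in network byte order, a client can
treat the RXSTATS payload as a flat array of words and convert them in one
pass. The following sketch does only that conversion; the function names are
hypothetical. Mapping word positions back to member names (word 0 is
packetRequests, word 1 is receivePktAllocFailures, and so on in declaration
order) is left to the caller and depends on the value of RX_N_PACKET_TYPES
used by the service.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

static int32_t get_be32(const uint8_t *p)
{
    return (int32_t)(((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
                     ((uint32_t)p[2] << 8) | (uint32_t)p[3]);
}

/*
 * Convert up to max big-endian 32-bit words from the payload into host
 * order. Trailing bytes that do not form a full word are ignored.
 * Returns the number of words written to out.
 */
static size_t decode_be32_words(const uint8_t *p, size_t len,
                                int32_t *out, size_t max)
{
    size_t n = len / 4;
    size_t i;

    assert(p != NULL && out != NULL);
    if (n > max)
        n = max;
    for (i = 0; i < n; i++)
        out[i] = get_be32(p + 4 * i);
    return n;
}
```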


GETPEER (5)
===========

GETPEER (5) returns a peer information record for the given index.

    struct rx_debugPeer {
        afs_uint32 host;
        u_short port;
        u_short ifMTU;
        afs_uint32 idleWhen;
        short refCount;
        u_char burstSize;
        u_char burst;
        struct clock burstWait;
        afs_int32 rtt;
        afs_int32 rtt_dev;
        struct clock timeout;
        afs_int32 nSent;
        afs_int32 reSends;
        afs_int32 inPacketSkew;
        afs_int32 outPacketSkew;
        afs_int32 rateFlag;
        u_short natMTU;
        u_short maxMTU;
        u_short maxDgramPackets;
        u_short ifDgramPackets;
        u_short MTU;
        u_short cwind;
        u_short nDgramPackets;
        u_short congestSeq;
        afs_hyper_t bytesSent;
        afs_hyper_t bytesReceived;
        afs_int32 sparel[10];
    };
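
As an illustration, the first three members of a peer record can be decoded as
shown below. The sketch assumes the members are packed at offsets 0, 4, and 6
with no padding, and that host and port are in network byte order like the
other integers. The struct peer_addr type and the function name are
hypothetical, and the remaining members are not decoded.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct peer_addr {      /* host-order copy; hypothetical name */
    uint32_t host;      /* IPv4 address, most significant octet first */
    uint16_t port;
    uint16_t ifMTU;
};

/* Returns 0 on success, -1 if the payload is too short. */
static int decode_peer_addr(const uint8_t *p, size_t len,
                            struct peer_addr *out)
{
    assert(p != NULL && out != NULL);
    if (len < 8)
        return -1;
    out->host = ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
                ((uint32_t)p[2] << 8) | (uint32_t)p[3];
    out->port = (uint16_t)((p[4] << 8) | p[5]);
    out->ifMTU = (uint16_t)((p[6] << 8) | p[7]);
    return 0;
}
```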