2001-04-09 Martin Grabmueller <mgrabmue@cs.tu-berlin.de>
1 @page
2 @node Scheme Primitives
3 @c @chapter Writing Scheme primitives in C
4 @c - according to the menu in guile.texi - NJ 2001/1/26
5 @chapter Relationship between Scheme and C functions
6
7 @c Chapter contents contributed by Thien-Thi Nguyen <ttn@gnu.org>.
8
Scheme procedures marked "primitive functions" have a regular interface
when called from C, reflected in two areas: the name of the C function, and
the convention for passing non-required arguments to this function.
12
13 @c Although the vast majority of functions support these relationships,
14 @c there are some exceptions.
15
16 @menu
17 * Transforming Scheme name to C name::
18 * Structuring argument lists for C functions::
19 @c * Exceptions to the regularity::
20 @end menu
21
22 @node Transforming Scheme name to C name
23 @section Transforming Scheme name to C name
24
25 Normally, the name of a C function can be derived given its Scheme name,
26 using some simple textual transformations:
27
28 @itemize @bullet
29
30 @item
31 Replace @code{-} (hyphen) with @code{_} (underscore).
32
33 @item
34 Replace @code{?} (question mark) with "_p".
35
36 @item
37 Replace @code{!} (exclamation point) with "_x".
38
39 @item
40 Replace internal @code{->} with "_to_".
41
42 @item
43 Replace @code{<=} (less than or equal) with "_leq".
44
45 @item
46 Replace @code{>=} (greater than or equal) with "_geq".
47
48 @item
49 Replace @code{<} (less than) with "_less".
50
51 @item
52 Replace @code{>} (greater than) with "_gr".
53
54 @item
55 Replace @code{@@} with "at". [Omit?]
56
57 @item
58 Prefix with "gh_" (or "scm_" if you are ignoring the gh interface).
59
60 @item
61 [Anything else? --ttn, 2000/01/16 15:17:28]
62
63 @end itemize
64
65 Here is an Emacs Lisp command that prompts for a Scheme function name and
66 inserts the corresponding C function name into the buffer.
67
68 @example
(defun insert-scheme-to-C (name &optional use-gh)
  "Transforms Scheme NAME, a string, to its C counterpart, and inserts it.
Prefix arg non-nil means use \"gh_\" prefix, otherwise use \"scm_\" prefix."
  (interactive "sScheme name: \nP")
  ;; Multi-character tokens come first: if "-" were replaced before "->",
  ;; the "->" rule could never match.
  (let ((transforms '(("->" . "_to_")
                      ("<=" . "_leq")
                      (">=" . "_geq")
                      ("-" . "_")
                      ("?" . "_p")
                      ("!" . "_x")
                      ("<" . "_less")
                      (">" . "_gr")
                      ("@@" . "at"))))
    (while transforms
      (let ((trigger (concat "\\(.*\\)"
                             (regexp-quote (caar transforms))
                             "\\(.*\\)"))
            (sub (cdar transforms))
            (m nil))
        (while (setq m (string-match trigger name))
          (setq name (concat (match-string 1 name)
                             sub
                             (match-string 2 name)))))
      (setq transforms (cdr transforms))))
  (insert (if use-gh "gh_" "scm_") name))
94 @end example
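
The same rules can be sketched in C. This is only an illustration, not
part of libguile: the helper name @code{scheme_to_c_name} and the
@code{rules} table are hypothetical, and the sketch replaces @code{->}
wherever it occurs rather than only in internal positions. Longer tokens
are listed first so that @code{->} is matched before @code{-}.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Replacement rules from the list above.  Multi-character tokens come
   first so that "->" is not consumed by the "-" rule. */
static const struct { const char *from; const char *to; } rules[] = {
  { "->", "_to_" }, { "<=", "_leq" }, { ">=", "_geq" },
  { "-", "_" }, { "?", "_p" }, { "!", "_x" },
  { "<", "_less" }, { ">", "_gr" }, { "@", "at" },
};

/* Hypothetical helper: write the C name for SCHEME_NAME into OUT
   (capacity OUT_SIZE), using PREFIX ("scm_" or "gh_").
   Returns 0 on success, -1 if OUT is too small. */
static int
scheme_to_c_name (const char *scheme_name, const char *prefix,
                  char *out, size_t out_size)
{
  size_t used = strlen (prefix);
  const char *p = scheme_name;

  assert (out != NULL && out_size > 0);
  if (used >= out_size)
    return -1;
  memcpy (out, prefix, used);

  while (*p != '\0')
    {
      const char *piece = NULL;
      size_t piece_len = 1, consumed = 1, i;

      /* Use the first rule whose token starts at P, if any. */
      for (i = 0; i < sizeof rules / sizeof rules[0]; i++)
        {
          size_t n = strlen (rules[i].from);
          if (strncmp (p, rules[i].from, n) == 0)
            {
              piece = rules[i].to;
              piece_len = strlen (piece);
              consumed = n;
              break;
            }
        }
      if (piece == NULL)
        piece = p;              /* ordinary character: copy as-is */

      if (used + piece_len >= out_size)
        return -1;              /* no room for the piece plus the NUL */
      memcpy (out + used, piece, piece_len);
      used += piece_len;
      p += consumed;
    }
  out[used] = '\0';
  return 0;
}
```

With a sufficiently large buffer, @code{string->list} and the prefix
@code{scm_} give @code{scm_string_to_list}, and @code{set-car!} gives
@code{scm_set_car_x}.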
95
96 @node Structuring argument lists for C functions
97 @section Structuring argument lists for C functions
98
The C function's arguments will be all of the Scheme procedure's
arguments, both required and optional; if the Scheme procedure takes a
``rest'' argument, that will be a final argument to the C function. The
C function's arguments, as well as its return type, will be @code{SCM}.
103
104 @c @node Exceptions to the regularity
105 @c @section Exceptions to the regularity
106 @c
107 @c There are some exceptions to the regular structure described above.
108
109
110 @page
111 @node I/O Extensions
112 @chapter Using and Extending Ports in C
113
114 @menu
115 * C Port Interface:: Using ports from C.
116 * Port Implementation:: How to implement a new port type in C.
117 @end menu
118
119
120 @node C Port Interface
121 @section C Port Interface
122
123 This section describes how to use Scheme ports from C.
124
125 @subsection Port basics
126
127 There are two main data structures. A port type object (ptob) is of
128 type @code{scm_ptob_descriptor}. A port instance is of type
129 @code{scm_port}. Given an @code{SCM} variable which points to a port,
130 the corresponding C port object can be obtained using the
131 @code{SCM_PTAB_ENTRY} macro. The ptob can be obtained by using
132 @code{SCM_PTOBNUM} to give an index into the @code{scm_ptobs}
133 global array.
134
135 @subsection Port buffers
136
An input port always has a read buffer and an output port always has a
write buffer. However, the size of these buffers is not guaranteed to be
more than one byte (e.g., the @code{shortbuf} field in @code{scm_port},
which is used when no other buffer is allocated). The way in which the
buffers are allocated depends on the implementation of the ptob. For
example, in the case of an fport, buffers may be allocated with malloc
when the port is created, but in the case of an strport the underlying
string is used as the buffer.
145
146 @subsection The @code{rw_random} flag
147
Special treatment is required for ports that support random-access seeking.
Before various operations, such as seeking the port or changing from
input to output on a bidirectional port or vice versa, the port
implementation must be given a chance to update its state. The write
buffer is updated by calling the @code{flush} ptob procedure and the
input buffer is updated by calling the @code{end_input} ptob procedure.
In the case of an fport, @code{flush} causes buffered output to be
written to the file descriptor, while @code{end_input} causes the
descriptor position to be adjusted to account for buffered input which
was never read.
158
159 The special treatment must be performed if the @code{rw_random} flag in
160 the port is non-zero.
161
162 @subsection The @code{rw_active} variable
163
164 The @code{rw_active} variable in the port is only used if
165 @code{rw_random} is set. It's defined as an enum with the following
166 values:
167
168 @table @code
169 @item SCM_PORT_READ
170 the read buffer may have unread data.
171
172 @item SCM_PORT_WRITE
173 the write buffer may have unwritten data.
174
175 @item SCM_PORT_NEITHER
176 neither the write nor the read buffer has data.
177 @end table
178
@subsection Reading from a port
180
181 To read from a port, it's possible to either call existing libguile
182 procedures such as @code{scm_getc} and @code{scm_read_line} or to read
183 data from the read buffer directly. Reading from the buffer involves
184 the following steps:
185
186 @enumerate
187 @item
188 Flush output on the port, if @code{rw_active} is @code{SCM_PORT_WRITE}.
189
190 @item
191 Fill the read buffer, if it's empty, using @code{scm_fill_input}.
192
@item
Read the data from the buffer and update the read position in
the buffer. Steps 2 and 3 may be repeated as many times as required.

@item
Set @code{rw_active} to @code{SCM_PORT_READ} if @code{rw_random} is set.

@item
Update the port's line and column counts.
199 @end enumerate
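
The order of these steps can be illustrated with a small self-contained
model. Everything in it (@code{toy_port}, @code{toy_flush},
@code{toy_fill_input}, @code{toy_read}, the @code{TOY_} constants) is a
hypothetical stand-in written for this sketch. None of it is the
libguile definition, and real code would use @code{scm_port},
@code{scm_fill_input} and the ptob procedures instead.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-ins; these are NOT the libguile definitions. */
enum toy_rw { TOY_NEITHER, TOY_READ, TOY_WRITE };

struct toy_port
{
  const char *source;       /* pretend backing store */
  size_t source_pos;        /* next unread byte in SOURCE */
  char read_buf[4];         /* deliberately tiny read buffer */
  size_t read_pos, read_end;
  int rw_random;
  enum toy_rw rw_active;
  int flushes;              /* counts calls to toy_flush */
  long line, column;
};

static void
toy_flush (struct toy_port *pt)
{
  pt->flushes++;
  pt->rw_active = TOY_NEITHER;
}

/* Refill the read buffer; returns the number of bytes now available. */
static size_t
toy_fill_input (struct toy_port *pt)
{
  size_t left = strlen (pt->source + pt->source_pos);
  size_t n = left < sizeof pt->read_buf ? left : sizeof pt->read_buf;

  memcpy (pt->read_buf, pt->source + pt->source_pos, n);
  pt->source_pos += n;
  pt->read_pos = 0;
  pt->read_end = n;
  return n;
}

/* Read up to LEN bytes into DST, following the five steps above.
   Returns the number of bytes actually read. */
static size_t
toy_read (struct toy_port *pt, char *dst, size_t len)
{
  size_t got = 0, i;

  assert (pt != NULL && dst != NULL);

  if (pt->rw_random && pt->rw_active == TOY_WRITE)    /* step 1 */
    toy_flush (pt);

  while (got < len)
    {
      if (pt->read_pos == pt->read_end                /* step 2 */
          && toy_fill_input (pt) == 0)
        break;                                        /* end of input */
      dst[got++] = pt->read_buf[pt->read_pos++];      /* step 3 */
    }

  if (pt->rw_random)                                  /* step 4 */
    pt->rw_active = TOY_READ;

  for (i = 0; i < got; i++)                           /* step 5 */
    {
      if (dst[i] == '\n')
        {
          pt->line++;
          pt->column = 0;
        }
      else
        pt->column++;
    }
  return got;
}
```

The four-byte buffer is chosen so that a single call has to repeat steps
2 and 3, which is the situation the enumeration describes.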
200
@subsection Writing to a port
202
203 To write data to a port, calling @code{scm_lfwrite} should be sufficient for
204 most purposes. This takes care of the following steps:
205
206 @enumerate
207 @item
208 End input on the port, if @code{rw_active} is @code{SCM_PORT_READ}.
209
210 @item
211 Pass the data to the ptob implementation using the @code{write} ptob
212 procedure. The advantage of using the ptob @code{write} instead of
213 manipulating the write buffer directly is that it allows the data to be
214 written in one operation even if the port is using the single-byte
215 @code{shortbuf}.
216
217 @item
218 Set @code{rw_active} to @code{SCM_PORT_WRITE} if @code{rw_random}
219 is set.
220 @end enumerate
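
As a rough illustration of that sequence, the following self-contained
model performs the same three steps. The names @code{toy_port},
@code{toy_ptob}, @code{toy_lfwrite} and the @code{TOY_} constants are
hypothetical and exist only in this sketch. It is not how
@code{scm_lfwrite} is actually implemented.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-ins; these are NOT the libguile definitions. */
enum toy_rw { TOY_NEITHER, TOY_READ, TOY_WRITE };

struct toy_port;

/* Toy port type: only the two procedures this sketch needs. */
struct toy_ptob
{
  void (*end_input) (struct toy_port *pt);
  void (*write) (struct toy_port *pt, const void *data, size_t size);
};

struct toy_port
{
  const struct toy_ptob *ptob;
  int rw_random;
  enum toy_rw rw_active;
  int end_input_calls;      /* counts calls to the end_input procedure */
  char sink[32];            /* where the toy write procedure stores data */
  size_t sink_len;
};

static void
toy_end_input (struct toy_port *pt)
{
  pt->end_input_calls++;
  pt->rw_active = TOY_NEITHER;
}

static void
toy_write (struct toy_port *pt, const void *data, size_t size)
{
  assert (pt->sink_len + size <= sizeof pt->sink);
  memcpy (pt->sink + pt->sink_len, data, size);
  pt->sink_len += size;
}

static const struct toy_ptob toy_type = { toy_end_input, toy_write };

/* Write SIZE bytes from DATA, following the three steps above. */
static void
toy_lfwrite (struct toy_port *pt, const void *data, size_t size)
{
  if (pt->rw_random && pt->rw_active == TOY_READ)     /* step 1 */
    pt->ptob->end_input (pt);

  pt->ptob->write (pt, data, size);                   /* step 2 */

  if (pt->rw_random)                                  /* step 3 */
    pt->rw_active = TOY_WRITE;
}
```

The caller hands the whole block to the port type's @code{write}
procedure and never touches a write buffer itself, which is the
advantage described in step 2.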
221
222
223 @node Port Implementation
224 @section Port Implementation
225
226 This section describes how to implement a new port type in C.
227
228 As described in the previous section, a port type object (ptob) is
229 a structure of type @code{scm_ptob_descriptor}. A ptob is created by
230 calling @code{scm_make_port_type}.
231
232 All of the elements of the ptob, apart from @code{name}, are procedures
233 which collectively implement the port behaviour. Creating a new port
234 type mostly involves writing these procedures.
235
236 @code{scm_make_port_type} initialises three elements of the structure
237 (@code{name}, @code{fill_input} and @code{write}) from its arguments.
238 The remaining elements are initialised with default values and can be
239 set later if required.
240
241 @table @code
242 @item name
243 A pointer to a NUL terminated string: the name of the port type. This
244 is the only element of @code{scm_ptob_descriptor} which is not
245 a procedure. Set via the first argument to @code{scm_make_port_type}.
246
247 @item mark
248 Called during garbage collection to mark any SCM objects that a port
249 object may contain. It doesn't need to be set unless the port has
250 @code{SCM} components. Set using @code{scm_set_port_mark}.
251
252 @item free
253 Called when the port is collected during gc. It
254 should free any resources used by the port.
255 Set using @code{scm_set_port_free}.
256
257 @item print
Called when @code{write} is called on the port object, to print a
port description. For an fport, for example, it may produce something like
@code{#<input: /etc/passwd 3>}. Set using @code{scm_set_port_print}.
261
262 @item equalp
263 Not used at present. Set using @code{scm_set_port_equalp}.
264
265 @item close
266 Called when the port is closed, unless it was collected during gc. It
267 should free any resources used by the port.
268 Set using @code{scm_set_port_close}.
269
270 @item write
271 Accept data which is to be written using the port. The port implementation
272 may choose to buffer the data instead of processing it directly.
273 Set via the third argument to @code{scm_make_port_type}.
274
275 @item flush
276 Complete the processing of buffered output data. Reset the value of
277 @code{rw_active} to @code{SCM_PORT_NEITHER}.
278 Set using @code{scm_set_port_flush}.
279
280 @item end_input
281 Perform any synchronisation required when switching from input to output
282 on the port. Reset the value of @code{rw_active} to @code{SCM_PORT_NEITHER}.
283 Set using @code{scm_set_port_end_input}.
284
285 @item fill_input
286 Read new data into the read buffer and return the first character. It
287 can be assumed that the read buffer is empty when this procedure is called.
288 Set via the second argument to @code{scm_make_port_type}.
289
290 @item input_waiting
291 Return a lower bound on the number of bytes that could be read from the
292 port without blocking. It can be assumed that the current state of
293 @code{rw_active} is @code{SCM_PORT_NEITHER}.
294 Set using @code{scm_set_port_input_waiting}.
295
296 @item seek
Set the current position of the port. The procedure cannot make
any assumptions about the value of @code{rw_active} when it's
called. It can reset the buffers first if desired by using something
like:
301
302 @example
if (pt->rw_active == SCM_PORT_READ)
  scm_end_input (object);
else if (pt->rw_active == SCM_PORT_WRITE)
  ptob->flush (object);
307 @end example
308
Note, however, that this will have the side effect of discarding any data
in the unread-char buffer, in addition to any side effects from the
@code{end_input} and @code{flush} ptob procedures. This is undesirable
when seek is called to measure the current position of the port, i.e.,
@code{(seek p 0 SEEK_CUR)}. The libguile fport and string port
implementations take care to avoid this problem.
315
316 The procedure is set using @code{scm_set_port_seek}.
317
318 @item truncate
Truncate the port data to the specified length. It can be assumed that the
current state of @code{rw_active} is @code{SCM_PORT_NEITHER}.
Set using @code{scm_set_port_truncate}.
322
323 @end table
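
The pattern described in this section can be modelled in a few lines of
standalone C: three elements are supplied when the type is created, and
the others start with defaults and are replaced through setters. Every
name below (@code{toy_ptob}, @code{toy_make_port_type},
@code{toy_set_port_close}, the @code{null_} procedures) is hypothetical,
and the do-nothing defaults are an assumption of this sketch. The actual
structure layout, signatures and default procedures are those in the
libguile headers.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct toy_port;   /* opaque in this sketch */

/* Toy descriptor with a subset of the elements listed above. */
struct toy_ptob
{
  const char *name;
  int (*fill_input) (struct toy_port *pt);
  void (*write) (struct toy_port *pt, const void *data, size_t size);
  void (*flush) (struct toy_port *pt);
  int (*close) (struct toy_port *pt);
};

/* Assumed defaults for this sketch: do nothing. */
static void default_flush (struct toy_port *pt) { (void) pt; }
static int default_close (struct toy_port *pt) { (void) pt; return 0; }

/* Three elements come from the arguments; the rest get defaults. */
static struct toy_ptob
toy_make_port_type (const char *name,
                    int (*fill_input) (struct toy_port *pt),
                    void (*write) (struct toy_port *pt,
                                   const void *data, size_t size))
{
  struct toy_ptob t;

  assert (name != NULL);
  t.name = name;
  t.fill_input = fill_input;
  t.write = write;
  t.flush = default_flush;
  t.close = default_close;
  return t;
}

/* Setter in the style of the scm_set_port_* family. */
static void
toy_set_port_close (struct toy_ptob *t, int (*close) (struct toy_port *pt))
{
  t->close = close;
}

/* Procedures for a hypothetical "null" port type. */
static int
null_fill_input (struct toy_port *pt)
{
  (void) pt;
  return -1;                    /* this toy never has input */
}

static void
null_write (struct toy_port *pt, const void *data, size_t size)
{
  (void) pt; (void) data; (void) size;   /* discard everything */
}

static int
null_close (struct toy_port *pt)
{
  (void) pt;
  return 0;
}
```

A port type made this way is usable as soon as @code{fill_input} and
@code{write} are supplied. The remaining procedures only need replacing
when the default behaviour is not enough.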
324
325
326 @node Handling Errors
327 @chapter How to Handle Errors in C Code
328
329 Error handling is based on catch and throw. Errors are always thrown with
330 a key and four arguments:
331
332 @itemize @bullet
333 @item
334 key: a symbol which indicates the type of error. The symbols used
335 by libguile are listed below.
336
337 @item
338 subr: the name of the procedure from which the error is thrown, or #f.
339
340 @item
341 message: a string (possibly language and system dependent) describing the
342 error. The tokens %s and %S can be embedded within the message: they
343 will be replaced with members of the args list when the message is
344 printed. %s indicates an argument printed using "display", while %S
345 indicates an argument printed using "write". message can also be #f,
346 to allow it to be derived from the key by the error handler (may be
347 useful if the key is to be thrown from both C and Scheme).
348
349 @item
350 args: a list of arguments to be used to expand %s and %S tokens in message.
351 Can also be #f if no arguments are required.
352
353 @item
rest: a list of any additional objects required. For example, when the key is
'system-error, this contains the C errno value. Can also be #f if no
additional objects are required.
357 @end itemize
358
359 In addition to catch and throw, the following Scheme facilities are
360 available:
361
362 @itemize @bullet
363 @item
364 (scm-error key subr message args rest): throw an error, with arguments
365 as described above.
366
367 @item
(error msg arg ...): throw an error using the key 'misc-error. The error
message is created by displaying msg and writing the args.
370 @end itemize
371
372 The following are the error keys defined by libguile and the situations
373 in which they are used:
374
375 @itemize @bullet
376 @item
error-signal: thrown after receiving an unhandled fatal signal such as
SIGSEGV, SIGBUS, SIGFPE, etc. The "rest" argument in the throw contains
the coded signal number (at present this is not the same as the usual
Unix signal number).
381
382 @item
383 system-error: thrown after the operating system indicates an error
384 condition. The "rest" argument in the throw contains the errno value.
385
386 @item
387 numerical-overflow: numerical overflow.
388
389 @item
390 out-of-range: the arguments to a procedure do not fall within the
391 accepted domain.
392
393 @item
wrong-type-arg: an argument to a procedure has the wrong type.
395
396 @item
397 wrong-number-of-args: a procedure was called with the wrong number of
398 arguments.
399
400 @item
401 memory-allocation-error: memory allocation error.
402
403 @item
404 stack-overflow: stack overflow error.
405
406 @item
407 regex-error: errors generated by the regular expression library.
408
409 @item
410 misc-error: other errors.
411 @end itemize
412
413
414 @section C Support
415
416 SCM scm_error (SCM key, char *subr, char *message, SCM args, SCM rest)
417
418 Throws an error, after converting the char * arguments to Scheme strings.
419 subr is the Scheme name of the procedure, NULL is converted to #f.
420 Likewise a NULL message is converted to #f.
421
422 The following procedures invoke scm_error with various error keys and
423 arguments. The first three call scm_error with the system-error key
424 and automatically supply errno in the "rest" argument: scm_syserror
425 generates messages using strerror, scm_sysmissing is used when
426 facilities are not available. Care should be taken that the errno
427 value is not reset (e.g. due to an interrupt).
428
429 @itemize @bullet
430 @item
431 void scm_syserror (char *subr);
432 @item
433 void scm_syserror_msg (char *subr, char *message, SCM args);
434 @item
435 void scm_sysmissing (char *subr);
436 @item
437 void scm_num_overflow (char *subr);
438 @item
439 void scm_out_of_range (char *subr, SCM bad_value);
440 @item
441 void scm_wrong_num_args (SCM proc);
442 @item
443 void scm_wrong_type_arg (char *subr, int pos, SCM bad_value);
444 @item
445 void scm_memory_error (char *subr);
446 @item
447 static void scm_regex_error (char *subr, int code); (only used in rgx.c).
448 @end itemize
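
The warning about errno can be made concrete with a small standalone
sketch. It assumes a POSIX system, where a failed @code{fopen} sets
errno. The wrapper @code{open_or_save_errno} is hypothetical and the
clobbering is simulated. In a real primitive, the call to
@code{scm_syserror} would go where the comment indicates.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/* Hypothetical wrapper: try to open PATH for reading.  On failure,
   return NULL and store the errno value of the failed fopen in
   *SAVED_ERRNO, captured before anything else can overwrite it. */
static FILE *
open_or_save_errno (const char *path, int *saved_errno)
{
  FILE *fp;

  assert (path != NULL && saved_errno != NULL);
  errno = 0;
  fp = fopen (path, "r");
  if (fp == NULL)
    {
      *saved_errno = errno;     /* capture immediately */

      /* Cleanup or logging placed here may change errno ... */
      errno = 0;                /* simulated clobbering */

      /* ... so restore it before reporting.  In a real primitive this
         is the point where scm_syserror would be called. */
      errno = *saved_errno;
    }
  return fp;
}
```

Saving the value in a local variable right after the failing call keeps
the reported error tied to that call, even if other library calls run
before the error is thrown.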
449
Exception handlers can also be installed from C, using
scm_internal_catch, scm_lazy_catch, or scm_stack_catch from
libguile/throw.c. These have not yet been documented; however, the
source contains some useful comments.