2001-04-09 Martin Grabmueller <mgrabmue@cs.tu-berlin.de>
@page
@node Scheme Primitives
@c @chapter Writing Scheme primitives in C
@c - according to the menu in guile.texi - NJ 2001/1/26
@chapter Relationship between Scheme and C functions

@c Chapter contents contributed by Thien-Thi Nguyen <ttn@gnu.org>.

Scheme procedures marked "primitive functions" have a regular interface
when called from C, reflected in two areas: the name of the C function, and
the convention for passing non-required arguments to this function.

@c Although the vast majority of functions support these relationships,
@c there are some exceptions.

@menu
* Transforming Scheme name to C name::
* Structuring argument lists for C functions::
@c * Exceptions to the regularity::
@end menu

@node Transforming Scheme name to C name
@section Transforming Scheme name to C name

Normally, the name of a C function can be derived given its Scheme name,
using some simple textual transformations:

@itemize @bullet

@item
Replace @code{-} (hyphen) with @code{_} (underscore).

@item
Replace @code{?} (question mark) with "_p".

@item
Replace @code{!} (exclamation point) with "_x".

@item
Replace internal @code{->} with "_to_".

@item
Replace @code{<=} (less than or equal) with "_leq".

@item
Replace @code{>=} (greater than or equal) with "_geq".

@item
Replace @code{<} (less than) with "_less".

@item
Replace @code{>} (greater than) with "_gr".

@item
Replace @code{@@} with "at". [Omit?]

@item
Prefix with "gh_" (or "scm_" if you are ignoring the gh interface).

@item
[Anything else? --ttn, 2000/01/16 15:17:28]

@end itemize

Here is an Emacs Lisp command that prompts for a Scheme function name and
inserts the corresponding C function name into the buffer.

@example
(defun insert-scheme-to-C (name &optional use-gh)
  "Transforms Scheme NAME, a string, to its C counterpart, and inserts it.
Prefix arg non-nil means use \"gh_\" prefix, otherwise use \"scm_\" prefix."
  (interactive "sScheme name: \nP")
  ;; Order matters: "->" must be handled before "-", and "<=" / ">="
  ;; before "<" / ">", so that the longer tokens are not broken up first.
  (let ((transforms '(("->" . "_to_")
                      ("-" . "_")
                      ("?" . "_p")
                      ("!" . "_x")
                      ("<=" . "_leq")
                      (">=" . "_geq")
                      ("<" . "_less")
                      (">" . "_gr")
                      ("@@" . "at"))))
    (while transforms
      (let ((trigger (concat "\\(.*\\)"
                             (regexp-quote (caar transforms))
                             "\\(.*\\)"))
            (sub (cdar transforms)))
        (while (string-match trigger name)
          (setq name (concat (match-string 1 name)
                             sub
                             (match-string 2 name)))))
      (setq transforms (cdr transforms))))
  (insert (if use-gh "gh_" "scm_") name))
@end example
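The same rules can also be sketched in C. The block below is only an
illustration: @code{scheme_to_c_name} and @code{append_piece} are
hypothetical helpers, not part of libguile, and the sketch replaces
@code{->} wherever it occurs rather than only in the interior of a name.
Longer tokens are tried first so that @code{->} is not consumed by the
@code{-} rule.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Replacement table for the rules listed above.  Longer tokens come
   first so that "->" wins over "-" and "<=" wins over "<".  */
static const struct { const char *from; const char *to; } rules[] = {
  { "->", "_to_" }, { "<=", "_leq" }, { ">=", "_geq" },
  { "-", "_" }, { "?", "_p" }, { "!", "_x" },
  { "<", "_less" }, { ">", "_gr" }, { "@", "at" },
};

/* Append N bytes of S to OUT, keeping OUT NUL-terminated.
   Returns 0 on success, -1 if OUT (capacity OUTSIZE) is too small.  */
static int
append_piece (char *out, size_t outsize, size_t *used,
              const char *s, size_t n)
{
  if (*used + n >= outsize)
    return -1;
  memcpy (out + *used, s, n);
  *used += n;
  out[*used] = '\0';
  return 0;
}

/* Hypothetical helper: write the C name for SCHEME into OUT, starting
   with PREFIX ("scm_" or "gh_").  Returns 0 on success, -1 if OUT is
   too small.  */
int
scheme_to_c_name (const char *scheme, const char *prefix,
                  char *out, size_t outsize)
{
  size_t used = 0;
  size_t nrules = sizeof rules / sizeof rules[0];

  assert (scheme != NULL && prefix != NULL && out != NULL);
  if (outsize == 0
      || append_piece (out, outsize, &used, prefix, strlen (prefix)) != 0)
    return -1;

  while (*scheme != '\0')
    {
      size_t i;

      /* Find the first rule whose token starts at the current position.  */
      for (i = 0; i < nrules; i++)
        if (strncmp (scheme, rules[i].from, strlen (rules[i].from)) == 0)
          break;

      if (i < nrules)
        {
          /* A rule matched: emit its replacement and skip the token.  */
          if (append_piece (out, outsize, &used,
                            rules[i].to, strlen (rules[i].to)) != 0)
            return -1;
          scheme += strlen (rules[i].from);
        }
      else
        {
          /* No rule matched: copy the character unchanged.  */
          if (append_piece (out, outsize, &used, scheme, 1) != 0)
            return -1;
          scheme++;
        }
    }
  return 0;
}
```

With these rules, @code{string->number} with the @code{scm_} prefix gives
@code{scm_string_to_number}, and @code{set-car!} gives
@code{scm_set_car_x}.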

@node Structuring argument lists for C functions
@section Structuring argument lists for C functions

The C function's arguments will be all of the Scheme procedure's
arguments, both required and optional; if the Scheme procedure takes a
``rest'' argument, that will be the final argument to the C function. The
C function's arguments, as well as its return type, will be @code{SCM}.
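As a rough illustration of this convention, the sketch below shows the C
side of a made-up Scheme procedure with one required, one optional and one
rest argument. Everything in it is an assumption made for the example: the
@code{SCM} typedef is a stand-in so that the block compiles on its own,
@code{MOCK_OMITTED} is an invented marker for a missing optional argument,
and @code{scm_make_greeting} does not exist in libguile.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for libguile's SCM type so that the sketch compiles on its
   own; real code would get SCM from the libguile headers.  */
typedef const char *SCM;

/* Mock marker for an optional argument the caller left out.  This is an
   assumption of the sketch, not libguile's actual representation.  */
#define MOCK_OMITTED ((SCM) NULL)

/* Hypothetical Scheme procedure:
     (make-greeting name [punctuation] . extras)
   One required, one optional and one rest argument map to three SCM
   parameters, in that order, and the return type is SCM as well.  */
SCM
scm_make_greeting (SCM name, SCM punctuation, SCM extras)
{
  assert (name != NULL);
  (void) extras;                /* the rest argument always comes last */
  if (punctuation == MOCK_OMITTED)
    punctuation = "!";          /* default chosen by this sketch */
  return punctuation;
}
```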

@c @node Exceptions to the regularity
@c @section Exceptions to the regularity
@c
@c There are some exceptions to the regular structure described above.


@page
@node I/O Extensions
@chapter Using and Extending Ports in C

@menu
* C Port Interface::            Using ports from C.
* Port Implementation::         How to implement a new port type in C.
@end menu


@node C Port Interface
@section C Port Interface

This section describes how to use Scheme ports from C.

@subsection Port basics

There are two main data structures. A port type object (ptob) is of
type @code{scm_ptob_descriptor}. A port instance is of type
@code{scm_port}. Given an @code{SCM} variable which points to a port,
the corresponding C port object can be obtained using the
@code{SCM_PTAB_ENTRY} macro. The ptob can be obtained by using
@code{SCM_PTOBNUM} to give an index into the @code{scm_ptobs}
global array.

@subsection Port buffers

An input port always has a read buffer and an output port always has a
write buffer. However, the size of these buffers is not guaranteed to be
more than one byte (e.g., the @code{shortbuf} field in @code{scm_port}
which is used when no other buffer is allocated). The way in which the
buffers are allocated depends on the implementation of the ptob. For
example, in the case of an fport, buffers may be allocated with malloc
when the port is created, but in the case of an strport the underlying
string is used as the buffer.

@subsection The @code{rw_random} flag

Special treatment is required for ports which support seeking to
arbitrary positions. Before various operations, such as seeking the port
or changing from input to output on a bidirectional port or vice versa,
the port implementation must be given a chance to update its state. The
write buffer is updated by calling the @code{flush} ptob procedure and the
input buffer is updated by calling the @code{end_input} ptob procedure.
In the case of an fport, @code{flush} causes buffered output to be
written to the file descriptor, while @code{end_input} causes the
descriptor position to be adjusted to account for buffered input which
was never read.

The special treatment must be performed if the @code{rw_random} flag in
the port is non-zero.

@subsection The @code{rw_active} variable

The @code{rw_active} variable in the port is only used if
@code{rw_random} is set. It's defined as an enum with the following
values:

@table @code
@item SCM_PORT_READ
the read buffer may have unread data.

@item SCM_PORT_WRITE
the write buffer may have unwritten data.

@item SCM_PORT_NEITHER
neither the write nor the read buffer has data.
@end table

@subsection Reading from a port

To read from a port, it's possible to either call existing libguile
procedures such as @code{scm_getc} and @code{scm_read_line} or to read
data from the read buffer directly. Reading from the buffer involves
the following steps:

@enumerate
@item
Flush output on the port, if @code{rw_active} is @code{SCM_PORT_WRITE}.

@item
Fill the read buffer, if it's empty, using @code{scm_fill_input}.

@item
Read the data from the buffer and update the read position in
the buffer. Steps 2 and 3 may be repeated as many times as required.

@item
Set @code{rw_active} to @code{SCM_PORT_READ} if @code{rw_random} is set.

@item
Update the port's line and column counts.
@end enumerate
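The order of these steps can be modelled without libguile. The sketch
below is a self-contained mock, not the real port code: @code{struct
rd_port}, the @code{RD_PORT_} constants and the @code{rd_} functions are
all invented for the illustration, and the fill step simply exposes up to
four more bytes of an in-memory string as the read buffer.

```c
#include <assert.h>
#include <stddef.h>

/* Mock counterpart of the rw_active values; names are illustrative.  */
enum rd_rw_active { RD_PORT_NEITHER, RD_PORT_READ, RD_PORT_WRITE };

struct rd_port
{
  int rw_random;                /* non-zero: special treatment needed */
  enum rd_rw_active rw_active;
  const char *read_pos;         /* next unread byte in the read buffer */
  const char *read_end;         /* one past the last buffered byte */
  const char *source;           /* data not yet moved into the buffer */
  size_t source_left;
  long line, column;
  int flushes;                  /* how often the mock flush ran */
};

/* Stand-in for the flush ptob procedure.  */
static void
rd_flush (struct rd_port *pt)
{
  pt->flushes++;
  pt->rw_active = RD_PORT_NEITHER;
}

/* Stand-in for the fill step: make up to 4 more source bytes available.
   Returns 0 at end of input, 1 otherwise.  */
static int
rd_fill_input (struct rd_port *pt)
{
  size_t chunk = pt->source_left < 4 ? pt->source_left : 4;
  if (chunk == 0)
    return 0;
  pt->read_pos = pt->source;
  pt->read_end = pt->source + chunk;
  pt->source += chunk;
  pt->source_left -= chunk;
  return 1;
}

/* Read up to WANT bytes into DST; returns the number of bytes read.  */
size_t
rd_read (struct rd_port *pt, char *dst, size_t want)
{
  size_t got = 0;

  assert (pt != NULL && dst != NULL);

  /* Step 1: flush pending output.  */
  if (pt->rw_active == RD_PORT_WRITE)
    rd_flush (pt);

  while (got < want)
    {
      char c;

      /* Step 2: refill the read buffer when it is empty.  */
      if (pt->read_pos == pt->read_end && !rd_fill_input (pt))
        break;

      /* Step 3: take data and advance the read position.  */
      c = *pt->read_pos++;
      dst[got++] = c;

      /* Step 5, done per byte here: track line and column.  */
      if (c == '\n')
        {
          pt->line++;
          pt->column = 0;
        }
      else
        pt->column++;
    }

  /* Step 4: record that the read buffer may hold unread data.  */
  if (pt->rw_random)
    pt->rw_active = RD_PORT_READ;
  return got;
}
```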

@subsection Writing to a port

To write data to a port, calling @code{scm_lfwrite} should be sufficient for
most purposes. This takes care of the following steps:

@enumerate
@item
End input on the port, if @code{rw_active} is @code{SCM_PORT_READ}.

@item
Pass the data to the ptob implementation using the @code{write} ptob
procedure. The advantage of using the ptob @code{write} instead of
manipulating the write buffer directly is that it allows the data to be
written in one operation even if the port is using the single-byte
@code{shortbuf}.

@item
Set @code{rw_active} to @code{SCM_PORT_WRITE} if @code{rw_random}
is set.
@end enumerate
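The write path can be modelled in the same way. Again this is only a mock
under stated assumptions: @code{struct wr_port}, the @code{WR_PORT_}
constants and the @code{wr_} functions are invented names, and the
stand-in for the ptob @code{write} procedure just appends to a fixed-size
array.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Mock counterpart of the rw_active values; names are illustrative.  */
enum wr_rw_active { WR_PORT_NEITHER, WR_PORT_READ, WR_PORT_WRITE };

struct wr_port
{
  int rw_random;                /* non-zero: special treatment needed */
  enum wr_rw_active rw_active;
  int end_input_calls;          /* how often the mock end_input ran */
  char sink[64];                /* where the mock write puts its data */
  size_t sink_len;
};

/* Stand-in for the end_input ptob procedure.  */
static void
wr_end_input (struct wr_port *pt)
{
  pt->end_input_calls++;
  pt->rw_active = WR_PORT_NEITHER;
}

/* Stand-in for the write ptob procedure: append to SINK, dropping
   whatever does not fit.  */
static void
wr_ptob_write (struct wr_port *pt, const char *data, size_t size)
{
  size_t room = sizeof pt->sink - pt->sink_len;
  if (size > room)
    size = room;
  memcpy (pt->sink + pt->sink_len, data, size);
  pt->sink_len += size;
}

/* Mock of the three steps listed above.  */
void
wr_lfwrite (const char *data, size_t size, struct wr_port *pt)
{
  assert (data != NULL && pt != NULL);

  /* Step 1: end input if the read buffer may hold unread data.  */
  if (pt->rw_active == WR_PORT_READ)
    wr_end_input (pt);

  /* Step 2: hand all the data to the port type in one call.  */
  wr_ptob_write (pt, data, size);

  /* Step 3: record that the write buffer may hold unwritten data.  */
  if (pt->rw_random)
    pt->rw_active = WR_PORT_WRITE;
}
```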


@node Port Implementation
@section Port Implementation

This section describes how to implement a new port type in C.

As described in the previous section, a port type object (ptob) is
a structure of type @code{scm_ptob_descriptor}. A ptob is created by
calling @code{scm_make_port_type}.

All of the elements of the ptob, apart from @code{name}, are procedures
which collectively implement the port behaviour. Creating a new port
type mostly involves writing these procedures.

@code{scm_make_port_type} initialises three elements of the structure
(@code{name}, @code{fill_input} and @code{write}) from its arguments.
The remaining elements are initialised with default values and can be
set later if required.

@table @code
@item name
A pointer to a NUL-terminated string: the name of the port type. This
is the only element of @code{scm_ptob_descriptor} which is not
a procedure. Set via the first argument to @code{scm_make_port_type}.

@item mark
Called during garbage collection to mark any SCM objects that a port
object may contain. It doesn't need to be set unless the port has
@code{SCM} components. Set using @code{scm_set_port_mark}.

@item free
Called when the port is collected during gc. It
should free any resources used by the port.
Set using @code{scm_set_port_free}.

@item print
Called when @code{write} is called on the port object, to print a
port description. For example, for an fport it may produce something like:
@code{#<input: /etc/passwd 3>}. Set using @code{scm_set_port_print}.

@item equalp
Not used at present. Set using @code{scm_set_port_equalp}.

@item close
Called when the port is closed, unless it was collected during gc. It
should free any resources used by the port.
Set using @code{scm_set_port_close}.

@item write
Accept data which is to be written using the port. The port implementation
may choose to buffer the data instead of processing it directly.
Set via the third argument to @code{scm_make_port_type}.

@item flush
Complete the processing of buffered output data. Reset the value of
@code{rw_active} to @code{SCM_PORT_NEITHER}.
Set using @code{scm_set_port_flush}.

@item end_input
Perform any synchronisation required when switching from input to output
on the port. Reset the value of @code{rw_active} to @code{SCM_PORT_NEITHER}.
Set using @code{scm_set_port_end_input}.

@item fill_input
Read new data into the read buffer and return the first character. It
can be assumed that the read buffer is empty when this procedure is called.
Set via the second argument to @code{scm_make_port_type}.

@item input_waiting
Return a lower bound on the number of bytes that could be read from the
port without blocking. It can be assumed that the current state of
@code{rw_active} is @code{SCM_PORT_NEITHER}.
Set using @code{scm_set_port_input_waiting}.

@item seek
Set the current position of the port. The procedure cannot make
any assumptions about the value of @code{rw_active} when it's
called. It can reset the buffers first if desired by using something
like:

@example
      if (pt->rw_active == SCM_PORT_READ)
        scm_end_input (object);
      else if (pt->rw_active == SCM_PORT_WRITE)
        ptob->flush (object);
@end example

However, note that this will have the side effect of discarding any data
in the unread-char buffer, in addition to any side effects from the
@code{end_input} and @code{flush} ptob procedures. This is undesirable
when seek is called to measure the current position of the port, i.e.,
@code{(seek p 0 SEEK_CUR)}. The libguile fport and string port
implementations take care to avoid this problem.

The procedure is set using @code{scm_set_port_seek}.

@item truncate
Truncate the port data to the specified length. It can be assumed that the
current state of @code{rw_active} is @code{SCM_PORT_NEITHER}.
Set using @code{scm_set_port_truncate}.

@end table


@node Handling Errors
@chapter How to Handle Errors in C Code

Error handling is based on catch and throw. Errors are always thrown with
a key and four arguments:

@itemize @bullet
@item
key: a symbol which indicates the type of error. The symbols used
by libguile are listed below.

@item
subr: the name of the procedure from which the error is thrown, or #f.

@item
message: a string (possibly language and system dependent) describing the
error. The tokens %s and %S can be embedded within the message: they
will be replaced with members of the args list when the message is
printed. %s indicates an argument printed using "display", while %S
indicates an argument printed using "write". message can also be #f,
to allow it to be derived from the key by the error handler (may be
useful if the key is to be thrown from both C and Scheme).

@item
args: a list of arguments to be used to expand %s and %S tokens in message.
Can also be #f if no arguments are required.

@item
rest: a list of any additional objects required. For example, when the
key is 'system-error, this contains the C errno value. Can also be #f if no
additional objects are required.
@end itemize
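To make the role of the %s and %S tokens concrete, here is a simplified,
self-contained model of the expansion. It is not how libguile does it:
@code{expand_message} and @code{put_piece} are hypothetical names, the
args list is represented as an array of C strings, %s copies an argument
unchanged (as "display" would for a string), and %S wraps it in double
quotes (as "write" would, ignoring the escaping of special characters).
Leaving a token untouched when no arguments remain is a choice made by
this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Append N bytes of S to OUT, keeping OUT NUL-terminated.
   Returns 0 on success, -1 if OUT (capacity OUTSIZE) is too small.  */
static int
put_piece (char *out, size_t outsize, size_t *used,
           const char *s, size_t n)
{
  if (*used + n >= outsize)
    return -1;
  memcpy (out + *used, s, n);
  *used += n;
  out[*used] = '\0';
  return 0;
}

/* Hypothetical helper: expand %s and %S in MESSAGE using ARGS, in
   order, writing the result to OUT.  Returns 0 on success, -1 if OUT
   is too small.  */
int
expand_message (const char *message, const char *const *args,
                size_t nargs, char *out, size_t outsize)
{
  size_t used = 0;
  size_t next = 0;              /* index of the next unused argument */

  assert (message != NULL && out != NULL);
  if (outsize == 0)
    return -1;
  out[0] = '\0';

  while (*message != '\0')
    {
      int is_token = message[0] == '%'
                     && (message[1] == 's' || message[1] == 'S');

      if (is_token && next < nargs)
        {
          const char *arg = args[next++];
          int quoted = message[1] == 'S';   /* %S: "write" style */

          if ((quoted && put_piece (out, outsize, &used, "\"", 1) != 0)
              || put_piece (out, outsize, &used, arg, strlen (arg)) != 0
              || (quoted && put_piece (out, outsize, &used, "\"", 1) != 0))
            return -1;
          message += 2;
        }
      else
        {
          /* Ordinary character, or a token with no argument left.  */
          if (put_piece (out, outsize, &used, message, 1) != 0)
            return -1;
          message++;
        }
    }
  return 0;
}
```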

In addition to catch and throw, the following Scheme facilities are
available:

@itemize @bullet
@item
(scm-error key subr message args rest): throw an error, with arguments
as described above.

@item
(error msg arg ...): throw an error using the key 'misc-error. The error
message is created by displaying msg and writing the args.
@end itemize

The following are the error keys defined by libguile and the situations
in which they are used:

@itemize @bullet
@item
error-signal: thrown after receiving an unhandled fatal signal such as
SIGSEGV, SIGBUS, SIGFPE etc. The "rest" argument in the throw contains
the coded signal number (at present this is not the same as the usual
Unix signal number).

@item
system-error: thrown after the operating system indicates an error
condition. The "rest" argument in the throw contains the errno value.

@item
numerical-overflow: numerical overflow.

@item
out-of-range: the arguments to a procedure do not fall within the
accepted domain.

@item
wrong-type-arg: an argument to a procedure has the wrong type.

@item
wrong-number-of-args: a procedure was called with the wrong number of
arguments.

@item
memory-allocation-error: memory allocation error.

@item
stack-overflow: stack overflow error.

@item
regex-error: errors generated by the regular expression library.

@item
misc-error: other errors.
@end itemize


@section C Support

SCM scm_error (SCM key, char *subr, char *message, SCM args, SCM rest)

Throws an error, after converting the char * arguments to Scheme strings.
subr is the Scheme name of the procedure; a NULL subr is converted to #f.
Likewise a NULL message is converted to #f.

The following procedures invoke scm_error with various error keys and
arguments. The first three call scm_error with the system-error key
and automatically supply errno in the "rest" argument: scm_syserror
generates messages using strerror, and scm_sysmissing is used when
facilities are not available. Care should be taken that the errno
value is not reset (e.g. due to an interrupt).

@itemize @bullet
@item
void scm_syserror (char *subr);
@item
void scm_syserror_msg (char *subr, char *message, SCM args);
@item
void scm_sysmissing (char *subr);
@item
void scm_num_overflow (char *subr);
@item
void scm_out_of_range (char *subr, SCM bad_value);
@item
void scm_wrong_num_args (SCM proc);
@item
void scm_wrong_type_arg (char *subr, int pos, SCM bad_value);
@item
void scm_memory_error (char *subr);
@item
static void scm_regex_error (char *subr, int code); (only used in rgx.c).
@end itemize
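The caution about errno can be illustrated with a small pattern: capture
errno immediately after the failing call and restore it just before
reporting. The sketch below runs without libguile and makes several
assumptions. @code{mock_syserror} is a stand-in that only records errno
and returns (a real reporting procedure throws instead),
@code{cleanup_that_clobbers_errno} represents any intervening call that
may overwrite errno, and @code{open_for_reading} is a hypothetical helper.
It also assumes a POSIX system, where a failed @code{fopen} sets errno.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/* The errno value seen by the mock reporter on its last call.  */
static int reported_errno;

/* Mock reporter: records errno as it is at the time of the call.  */
static void
mock_syserror (const char *subr)
{
  (void) subr;
  reported_errno = errno;
}

/* Stand-in for cleanup work that may overwrite errno.  */
static void
cleanup_that_clobbers_errno (void)
{
  errno = 0;
}

/* Hypothetical helper: returns the open stream, or NULL after
   reporting the failure with the original errno value.  */
FILE *
open_for_reading (const char *path)
{
  FILE *fp;

  assert (path != NULL);
  fp = fopen (path, "r");
  if (fp == NULL)
    {
      int saved_errno = errno;  /* capture right after the failing call */

      cleanup_that_clobbers_errno ();
      errno = saved_errno;      /* restore before reporting */
      mock_syserror ("open_for_reading");
    }
  return fp;
}
```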

Exception handlers can also be installed from C, using
scm_internal_catch, scm_lazy_catch, or scm_stack_catch from
libguile/throw.c. These have not yet been documented; however, the
source contains some useful comments.