[bpt/guile.git] / doc / ref / libguile-smobs.texi
1@c -*-texinfo-*-
2@c This is part of the GNU Guile Reference Manual.
3@c Copyright (C) 1996, 1997, 2000, 2001, 2002, 2003, 2004
4@c Free Software Foundation, Inc.
5@c See the file guile.texi for copying conditions.
6
7@node Defining New Types (Smobs)
8@section Defining New Types (Smobs)
9
10@dfn{Smobs} are Guile's mechanism for adding new primitive types to
11the system. The term ``smob'' was coined by Aubrey Jaffer, who says
12it comes from ``small object'', referring to the fact that they are
13quite limited in size: they can hold just one pointer to a larger
14memory block plus 16 extra bits.
15
16To define a new smob type, the programmer provides Guile with some
17essential information about the type --- how to print it, how to
18garbage collect it, and so on --- and Guile allocates a fresh type tag
19for it. The programmer can then use @code{scm_c_define_gsubr} to make
20a set of C functions visible to Scheme code that create and operate on
21these objects.
22
23(You can find a complete version of the example code used in this
24section in the Guile distribution, in @file{doc/example-smob}. That
25directory includes a makefile and a suitable @code{main} function, so
26you can build a complete interactive Guile shell, extended with the
27datatypes described here.)
28
29@menu
30* Describing a New Type::
31* Creating Instances::
32* Type checking::
33* Garbage Collecting Smobs::
34* Garbage Collecting Simple Smobs::
35* Remembering During Operations::
36* Double Smobs::
37* The Complete Example::
38@end menu
39
40@node Describing a New Type
41@subsection Describing a New Type
42
43To define a new type, the programmer must write four functions to
44manage instances of the type:
45
46@table @code
47@item mark
48Guile will apply this function to each instance of the new type it
49encounters during garbage collection. This function is responsible for
50telling the collector about any other @code{SCM} values that the object
51has stored. The default smob mark function does nothing.
52@xref{Garbage Collecting Smobs}, for more details.
53
54@item free
55Guile will apply this function to each instance of the new type that is
56to be deallocated. The function should release all resources held by
57the object. This is analogous to the Java finalization method: it is
58invoked at an unspecified time (when garbage collection occurs) after
59the object is dead. The default free function frees the smob data (if
60the size of the struct passed to @code{scm_make_smob_type} is non-zero)
61using @code{scm_gc_free}. @xref{Garbage Collecting Smobs}, for more
62details.
63
64@item print
65Guile will apply this function to each instance of the new type to print
66the value, as for @code{display} or @code{write}. The default print
67function prints @code{#<NAME ADDRESS>} where @code{NAME} is the first
68argument passed to @code{scm_make_smob_type}. For more information on
69printing, see @ref{Port Data}.
70
71@item equalp
72If Scheme code asks the @code{equal?} function to compare two instances
73of the same smob type, Guile calls this function. It should return
74@code{SCM_BOOL_T} if @var{a} and @var{b} should be considered
75@code{equal?}, or @code{SCM_BOOL_F} otherwise. If @code{equalp} is
76@code{NULL}, @code{equal?} will assume that two instances of this type are
77never @code{equal?} unless they are @code{eq?}.
78
79@end table
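
The image type used as the running example in this section does not
register an @code{equalp} function. As a rough sketch of the comparison
such a function might perform, the following plain C code compares two
images by dimensions and pixel contents. It deliberately does not use
libguile, so that it can be compiled on its own; @code{struct plain_image}
and @code{plain_image_equal} are hypothetical names invented for this
illustration. A real @code{equalp} function would presumably take two
@code{SCM} arguments, extract the structs with @code{SCM_SMOB_DATA}, and
return @code{SCM_BOOL_T} or @code{SCM_BOOL_F} instead of an @code{int}.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for an image struct, without any SCM fields. */
struct plain_image {
  int width, height;
  const char *pixels;   /* width * height bytes */
};

/* Return 1 if the two images have the same dimensions and the same
   pixel contents, 0 otherwise.  */
int
plain_image_equal (const struct plain_image *a, const struct plain_image *b)
{
  if (a->width != b->width || a->height != b->height)
    return 0;
  return memcmp (a->pixels, b->pixels,
                 (size_t) a->width * (size_t) a->height) == 0;
}
```

Comparing the dimensions first means that two images with the same
number of pixels but different shapes are not considered equal.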
80
81To actually register the new smob type, call @code{scm_make_smob_type}.
82It returns a value of type @code{scm_t_bits} which identifies the new
83smob type.
84
85The four special functions described above are registered by calling
86one of @code{scm_set_smob_mark}, @code{scm_set_smob_free},
87@code{scm_set_smob_print}, or @code{scm_set_smob_equalp}, as
88appropriate. Each function is intended to be used at most once per
89type, and the call should be placed immediately following the call to
90@code{scm_make_smob_type}.
91
92There can be at most 256 different smob types in the system.
93Instead of registering a huge number of smob types (for example, one
94for each relevant C struct in your application), it is sometimes
95better to register just one and implement a second layer of type
96dispatching on top of it. This second layer might use the 16 extra
97bits as an extended type, for example.
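
To make the idea of a second dispatch layer more concrete, here is a
minimal sketch in plain C. It is a toy model and does not use libguile:
@code{struct toy_smob} merely stands in for a smob with its 16 extra
bits and one data word, and all names (@code{sub_type},
@code{toy_area}, and so on) are hypothetical. In real code the
sub-type would presumably be read with @code{SCM_SMOB_FLAGS}.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sub-types sharing a single smob type.  The value is
   what would be stored in the smob's 16 extra bits.  */
enum sub_type { SUB_POINT = 0, SUB_RECT = 1, SUB_COUNT };

/* Toy stand-in for a smob: 16 flag bits plus one data word.  */
struct toy_smob {
  uint16_t flags;
  void *data;
};

struct point { int x, y; };
struct rect { int w, h; };

/* A point has no area; the data word is ignored.  */
static int
point_area (void *data)
{
  (void) data;
  return 0;
}

static int
rect_area (void *data)
{
  struct rect *r = data;
  return r->w * r->h;
}

/* Second-layer dispatch table, indexed by sub-type.  */
typedef int (*area_fn) (void *);
static const area_fn area_table[SUB_COUNT] = { point_area, rect_area };

/* Return the area of OBJ, or -1 if its sub-type is unknown.  */
int
toy_area (const struct toy_smob *obj)
{
  if (obj->flags >= SUB_COUNT)
    return -1;
  return area_table[obj->flags] (obj->data);
}
```

With this arrangement only one of the 256 smob types is consumed, no
matter how many sub-types the table grows to hold.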
98
99Here is how one might declare and register a new type representing
100eight-bit gray-scale images:
101
102@example
103#include <libguile.h>
104
105struct image @{
106 int width, height;
107 char *pixels;
108
109 /* The name of this image */
110 SCM name;
111
112 /* A function to call when this image is
113 modified, e.g., to update the screen,
114 or SCM_BOOL_F if no action necessary */
115 SCM update_func;
116@};
117
118static scm_t_bits image_tag;
119
120void
121init_image_type (void)
122@{
123 image_tag = scm_make_smob_type ("image", sizeof (struct image));
124 scm_set_smob_mark (image_tag, mark_image);
125 scm_set_smob_free (image_tag, free_image);
126 scm_set_smob_print (image_tag, print_image);
127@}
128@end example
129
130
131@node Creating Instances
132@subsection Creating Instances
133
134Normally, smobs can have one @emph{immediate} word of data. This word
135stores either a pointer to an additional memory block that holds the
136real data, or it might hold the data itself when it fits. The word is
137of type @code{scm_t_bits} and is large enough for a @code{SCM} value or
138a pointer to @code{void}.
139
140You can also create smobs that have two or three immediate words, and
141when these words suffice to store all data, it is more efficient to use
142these super-sized smobs instead of using a normal smob plus a memory
143block. @xref{Double Smobs}, for their discussion.
144
145To retrieve the immediate word of a smob, you use the macro
146@code{SCM_SMOB_DATA}. It can be set with @code{SCM_SET_SMOB_DATA}.
147The 16 extra bits can be accessed with @code{SCM_SMOB_FLAGS} and
148@code{SCM_SET_SMOB_FLAGS}.
149
150Guile provides functions for managing memory which are often helpful
151when implementing smobs. @xref{Memory Blocks}.
152
153Creating a smob instance can be tricky when it consists of multiple
154steps that allocate resources and might fail. It is recommended that
155you go about creating a smob in the following way:
156
157@itemize
158@item
159Allocate the memory block for holding the data with
160@code{scm_gc_malloc}.
161@item
162Initialize it to a valid state without calling any functions that might
163cause a non-local exit. For example, initialize pointers to NULL.
164Also, do not store @code{SCM} values in it that must be protected.
165Initialize these fields with @code{SCM_BOOL_F}.
166
167A valid state is one that can be safely acted upon by the @emph{mark}
168and @emph{free} functions of your smob type.
169@item
170Create the smob using @code{SCM_NEWSMOB}, passing it the initialized
171memory block. (This step will always succeed.)
172@item
173Complete the initialization of the memory block by, for example,
174allocating additional resources and making it point to them.
175@end itemize
176
177This procedure ensures that the smob is in a valid state as soon as it
178exists, that all resources that are allocated for the smob are properly
179associated with it so that they can be properly freed, and that no
180@code{SCM} values that need to be protected are stored in it while the
181smob does not yet completely exist and thus can not protect them.
182
183Continuing the example from above, if the global variable
184@code{image_tag} contains a tag returned by @code{scm_make_smob_type},
185here is how we could construct a smob whose immediate word contains a
186pointer to a freshly allocated @code{struct image}:
187
188@example
189SCM
190make_image (SCM name, SCM s_width, SCM s_height)
191@{
192 SCM smob;
193 struct image *image;
194 int width = scm_to_int (s_width);
195 int height = scm_to_int (s_height);
196
197 /* Step 1: Allocate the memory block.
198 */
199 image = (struct image *) scm_gc_malloc (sizeof (struct image), "image");
200
201 /* Step 2: Initialize it with straight code.
202 */
203 image->width = width;
204 image->height = height;
205 image->pixels = NULL;
206 image->name = SCM_BOOL_F;
207 image->update_func = SCM_BOOL_F;
208
209 /* Step 3: Create the smob.
210 */
211 SCM_NEWSMOB (smob, image);
212
213 /* Step 4: Finish the initialization.
214 */
215 image->name = name;
216 image->pixels = scm_gc_malloc (width * height, "image pixels");
217
218 return smob;
219@}
220@end example
221
222Let us look at what might happen when @code{make_image} is called.
223
224The conversions of @var{s_width} and @var{s_height} to @code{int}s might
225fail and signal an error, thus causing a non-local exit. This is not a
226problem since no resources have been allocated yet that would have to be
227freed.
228
229The allocation of @var{image} in step 1 might fail, but this is likewise
230no problem.
231
232Step 2 can not exit non-locally. At the end of it, the @var{image}
233struct is in a valid state for the @code{mark_image} and
234@code{free_image} functions (see below).
235
236Step 3 can not exit non-locally either. This is guaranteed by Guile.
237After it, @var{smob} contains a valid smob that is properly initialized
238and protected, and in turn can properly protect the Scheme values in its
239@var{image} struct.
240
241But before the smob is completely created, @code{SCM_NEWSMOB} might
242cause the garbage collector to run. During this garbage collection, the
243@code{SCM} values in the @var{image} struct would be invisible to Guile.
244It only gets to know about them via the @code{mark_image} function, but
245that function can not yet do its job since the smob has not been created
246yet. Thus, it is important to not store @code{SCM} values in the
247@var{image} struct until after the smob has been created.
248
249Step 4, finally, might fail and cause a non-local exit. In that case,
250the creation of the smob has not been successful. It will eventually be
251freed by the garbage collector, and all the resources that have been
252allocated for it will be correctly freed by @code{free_image}.
253
254@node Type checking
255@subsection Type checking
256
257Functions that operate on smobs should check that the passed @code{SCM}
258value indeed is a suitable smob before accessing its data.
259
260For example, here is a simple function that operates on an image smob,
261and checks the type of its argument.
262
263@example
264SCM
265clear_image (SCM image_smob)
266@{
267 int area;
268 struct image *image;
269
270 SCM_ASSERT (SCM_SMOB_PREDICATE (image_tag, image_smob),
271 image_smob, SCM_ARG1, "clear-image");
272
273 image = (struct image *) SCM_SMOB_DATA (image_smob);
274 area = image->width * image->height;
275 memset (image->pixels, 0, area);
276
277 /* Invoke the image's update function.
278 */
279 if (scm_is_true (image->update_func))
280 scm_call_0 (image->update_func);
281
282 scm_remember_upto_here_1 (image_smob);
283
284 return SCM_UNSPECIFIED;
285@}
286@end example
287
288@xref{Remembering During Operations}, for an explanation of the call
289to @code{scm_remember_upto_here_1}.
290
291
292@node Garbage Collecting Smobs
293@subsection Garbage Collecting Smobs
294
295Once a smob has been released to the tender mercies of the Scheme
296system, it must be prepared to survive garbage collection. Guile calls
297the @emph{mark} and @emph{free} functions of the smob to manage this.
298
299As described in more detail elsewhere (@pxref{Conservative GC}), every
300object in the Scheme system has a @dfn{mark bit}, which the garbage
301collector uses to tell live objects from dead ones. When collection
302starts, every object's mark bit is clear. The collector traces pointers
303through the heap, starting from objects known to be live, and sets the
304mark bit on each object it encounters. When it can find no more
305unmarked objects, the collector walks all objects, live and dead, frees
306those whose mark bits are still clear, and clears the mark bit on the
307others.
308
309The two main portions of the collection are called the @dfn{mark phase},
310during which the collector marks live objects, and the @dfn{sweep
311phase}, during which the collector frees all unmarked objects.
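
The two phases can be illustrated with a small toy collector in plain
C. This is only a sketch under simplifying assumptions, not Guile's
actual collector: the ``heap'' is a fixed array, each object has at
most two references, and ``freeing'' an object just sets a flag. All
names (@code{struct toy_obj}, @code{toy_mark}, @code{toy_sweep}) are
hypothetical.

```c
#include <stddef.h>

#define TOY_MAX_REFS 2

/* Toy heap object: a mark bit, a flag recording whether the object
   has been freed, and up to two outgoing references.  */
struct toy_obj {
  int marked;
  int freed;
  struct toy_obj *refs[TOY_MAX_REFS];
};

/* Mark phase: set the mark bit first, then follow the references,
   so that circular structures do not cause endless recursion.  */
void
toy_mark (struct toy_obj *obj)
{
  int i;

  if (obj == NULL || obj->marked)
    return;
  obj->marked = 1;
  for (i = 0; i < TOY_MAX_REFS; i++)
    toy_mark (obj->refs[i]);
}

/* Sweep phase: free every unmarked object and clear the mark bit on
   the others.  Returns the number of objects freed.  */
size_t
toy_sweep (struct toy_obj *heap, size_t n)
{
  size_t i, count = 0;

  for (i = 0; i < n; i++)
    {
      if (heap[i].freed)
        continue;
      if (heap[i].marked)
        heap[i].marked = 0;
      else
        {
          heap[i].freed = 1;
          count++;
        }
    }
  return count;
}
```

Calling @code{toy_mark} on each object known to be live and then
calling @code{toy_sweep} once over the whole array corresponds to one
complete collection in this model.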
312
313The mark bit of a smob lives in a special memory region. When the
314collector encounters a smob, it sets the smob's mark bit, and uses the
315smob's type tag to find the appropriate @emph{mark} function for that
316smob. It then calls this @emph{mark} function, passing it the smob as
317its only argument.
318
319The @emph{mark} function is responsible for marking any other Scheme
320objects the smob refers to. If it does not do so, the objects' mark
321bits will still be clear when the collector begins to sweep, and the
322collector will free them. If this occurs, it will probably break, or at
323least confuse, any code operating on the smob; the smob's @code{SCM}
324values will have become dangling references.
325
326To mark an arbitrary Scheme object, the @emph{mark} function calls
327@code{scm_gc_mark}.
328
329Thus, here is how we might write @code{mark_image}:
330
331@example
332@group
333SCM
334mark_image (SCM image_smob)
335@{
336 /* Mark the image's name and update function. */
337 struct image *image = (struct image *) SCM_SMOB_DATA (image_smob);
338
339 scm_gc_mark (image->name);
340 scm_gc_mark (image->update_func);
341
342 return SCM_BOOL_F;
343@}
344@end group
345@end example
346
347Note that, even though the image's @code{update_func} could be an
348arbitrarily complex structure (representing a procedure and any values
349enclosed in its environment), @code{scm_gc_mark} will recurse as
350necessary to mark all its components. Because @code{scm_gc_mark} sets
351an object's mark bit before it recurses, it is not confused by
352circular structures.
353
354As an optimization, the collector will mark whatever value is returned
355by the @emph{mark} function; this helps limit depth of recursion during
356the mark phase. Thus, the code above should really be written as:
357@example
358@group
359SCM
360mark_image (SCM image_smob)
361@{
362 /* Mark the image's name and update function. */
363 struct image *image = (struct image *) SCM_SMOB_DATA (image_smob);
364
365 scm_gc_mark (image->name);
366 return image->update_func;
367@}
368@end group
369@end example
370
371
372Finally, when the collector encounters an unmarked smob during the sweep
373phase, it uses the smob's tag to find the appropriate @emph{free}
374function for the smob. It then calls that function, passing it the smob
375as its only argument.
376
377The @emph{free} function must release any resources used by the smob.
378However, it must not free objects managed by the collector; the
379collector will take care of them. For historical reasons, the return
380type of the @emph{free} function should be @code{size_t}, an unsigned
381integral type; the @emph{free} function should always return zero.
382
383Here is how we might write the @code{free_image} function for the image
384smob type:
385@example
386size_t
387free_image (SCM image_smob)
388@{
389 struct image *image = (struct image *) SCM_SMOB_DATA (image_smob);
390
391 scm_gc_free (image->pixels, image->width * image->height, "image pixels");
392 scm_gc_free (image, sizeof (struct image), "image");
393
394 return 0;
395@}
396@end example
397
398During the sweep phase, the garbage collector will clear the mark bits
399on all live objects. The code which implements a smob need not do this
400itself.
401
402There is no way for smob code to be notified when collection is
403complete.
404
405It is usually a good idea to minimize the amount of processing done
406during garbage collection; keep the @emph{mark} and @emph{free}
407functions very simple. Since collections occur at unpredictable times,
408it is easy for any unusual activity to interfere with normal code.
409
410
411@node Garbage Collecting Simple Smobs
412@subsection Garbage Collecting Simple Smobs
413
414It is often useful to define very simple smob types --- smobs which have
415no data to mark, other than the cell itself, or smobs whose immediate
416data word is simply an ordinary Scheme object, to be marked recursively.
417Guile provides some functions to handle these common cases; you can use
418this function as your smob type's @emph{mark} function, if your smob's
419structure is simple enough.
420
421If the smob refers to no other Scheme objects, then no action is
422necessary; the garbage collector has already marked the smob cell
423itself. In that case, you can use zero as your mark function.
424
425@deftypefun SCM scm_markcdr (SCM @var{x})
426Mark the references in the smob @var{x}, assuming that @var{x}'s first
427data word contains an ordinary Scheme object, and @var{x} refers to no
428other objects. This function simply returns @var{x}'s first data word.
429
430This is only useful for simple smobs created by @code{SCM_NEWSMOB} or
431@code{SCM_RETURN_NEWSMOB}, not for smobs allocated as double cells.
432@end deftypefun
433
434@node Remembering During Operations
435@subsection Remembering During Operations
436@cindex Remembering
437
438It's important that a smob is visible to the garbage collector
439whenever its contents are being accessed. Otherwise it could be freed
440while code is still using it.
441
442For example, consider a procedure to convert image data to a list of
443pixel values.
444
445@example
446SCM
447image_to_list (SCM image_smob)
448@{
449 struct image *image;
450 SCM lst;
451 int i;
452 SCM_ASSERT (SCM_SMOB_PREDICATE (image_tag, image_smob),
453 image_smob, SCM_ARG1, "image->list");
454
455 image = (struct image *) SCM_SMOB_DATA (image_smob);
456 lst = SCM_EOL;
457 for (i = image->width * image->height - 1; i >= 0; i--)
458 lst = scm_cons (scm_from_char (image->pixels[i]), lst);
459
460 scm_remember_upto_here_1 (image_smob);
461 return lst;
462@}
463@end example
464
465In the loop, only the @code{image} pointer is used and the C compiler
466has no reason to keep the @code{image_smob} value anywhere. If
467@code{scm_cons} results in a garbage collection, @code{image_smob} might
468not be on the stack or anywhere else and could be freed, leaving the
469loop accessing freed data. The use of @code{scm_remember_upto_here_1}
470prevents this, by creating a reference to @code{image_smob} after all
471data accesses.
472
473There's no need to do the same for @code{lst}, since that's the return
474value and the compiler will certainly keep it in a register or
475somewhere throughout the routine.
476
477The @code{clear_image} example previously shown (@pxref{Type checking})
478also used @code{scm_remember_upto_here_1} for this reason.
479
480It's only in quite rare circumstances that a missing
481@code{scm_remember_upto_here_1} will bite, but when it happens the
482consequences are serious. Fortunately the rule is simple: whenever
483calling a Guile library function or doing something that might, ensure
484the @code{SCM} of a smob is referenced past all accesses to its
485insides. Do this by adding an @code{scm_remember_upto_here_1} if
486there are no other references.
487
488In a multi-threaded program, the rule is the same. As far as a given
489thread is concerned, a garbage collection still only occurs within a
490Guile library function, not at an arbitrary time. (Guile waits for all
491threads to reach one of its library functions, and holds them there
492while the collector runs.)
493
494@node Double Smobs
495@subsection Double Smobs
496
497Smobs are called smobs because they are small: they normally have only
498room for one @code{scm_t_bits} value plus 16 bits. The reason for
499this is that smobs are directly implemented by using the low-level,
500two-word cells of Guile that are also used to implement pairs, for
501example. (@xref{Data Representation}, for the details.) One word of
502the two-word cells is used for @code{SCM_SMOB_DATA}, the other
503contains the 16-bit type tag and the 16 extra bits.
504
505In addition to the fundamental two-word cells, Guile also has
506four-word cells, which are appropriately called @dfn{double cells}.
507You can use them for @dfn{double smobs} and get two more immediate
508words of type @code{scm_t_bits}.
509
510A double smob is created with @code{SCM_NEWSMOB2} or
511@code{SCM_NEWSMOB3} instead of @code{SCM_NEWSMOB}. Its immediate
512words can be retrieved with @code{SCM_SMOB_DATA2} and
513@code{SCM_SMOB_DATA3} in addition to @code{SCM_SMOB_DATA}.
514Unsurprisingly, the words can be set with @code{SCM_SET_SMOB_DATA2}
515and @code{SCM_SET_SMOB_DATA3}.
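
The relationship between ordinary cells and double cells can be
pictured with the following toy model in plain C. It is only a sketch
and does not use libguile. In particular, the way the type tag and the
extra bits are packed into the first word here is an assumption made
for illustration and is not claimed to match Guile's actual bit layout;
all names (@code{toy_pack}, @code{toy_new3}, and so on) are
hypothetical.

```c
#include <stdint.h>

/* Toy models of a two-word cell and a four-word double cell.  */
struct toy_cell { uintptr_t word[2]; };
struct toy_double_cell { uintptr_t word[4]; };

/* Pack a 16-bit type tag and 16 extra bits into one word.  The
   layout is an assumption for illustration only, and requires
   uintptr_t to be at least 32 bits wide.  */
uintptr_t
toy_pack (uint16_t tag, uint16_t flags)
{
  return (uintptr_t) tag | ((uintptr_t) flags << 16);
}

/* Extract the type tag from a packed word.  */
uint16_t
toy_tag (uintptr_t w)
{
  return (uint16_t) (w & 0xffff);
}

/* Extract the 16 extra bits from a packed word.  */
uint16_t
toy_flags (uintptr_t w)
{
  return (uint16_t) ((w >> 16) & 0xffff);
}

/* Build a double cell holding a tag word and three data words.  */
struct toy_double_cell
toy_new3 (uint16_t tag, uintptr_t d1, uintptr_t d2, uintptr_t d3)
{
  struct toy_double_cell c;

  c.word[0] = toy_pack (tag, 0);
  c.word[1] = d1;
  c.word[2] = d2;
  c.word[3] = d3;
  return c;
}
```

In this model the double cell is exactly twice the size of the ordinary
cell, which is where the two additional immediate words come from.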
516
517@node The Complete Example
518@subsection The Complete Example
519
520Here is the complete text of the implementation of the image datatype,
521as presented in the sections above. We also provide a definition for
522the smob's @emph{print} function, and make some objects and functions
523static, to clarify exactly what the surrounding code is using.
524
525As mentioned above, you can find this code in the Guile distribution, in
526@file{doc/example-smob}. That directory includes a makefile and a
527suitable @code{main} function, so you can build a complete interactive
528Guile shell, extended with the datatypes described here.
529
530@example
531/* file "image-type.c" */
532#include <string.h>  /* for memset */
533#include <stdlib.h>
534#include <libguile.h>
535
536static scm_t_bits image_tag;
537
538struct image @{
539 int width, height;
540 char *pixels;
541
542 /* The name of this image */
543 SCM name;
544
545 /* A function to call when this image is
546 modified, e.g., to update the screen,
547 or SCM_BOOL_F if no action necessary */
548 SCM update_func;
549@};
550
551static SCM
552make_image (SCM name, SCM s_width, SCM s_height)
553@{
554 SCM smob;
555 struct image *image;
556 int width = scm_to_int (s_width);
557 int height = scm_to_int (s_height);
558
559 /* Step 1: Allocate the memory block.
560 */
561 image = (struct image *) scm_gc_malloc (sizeof (struct image), "image");
562
563 /* Step 2: Initialize it with straight code.
564 */
565 image->width = width;
566 image->height = height;
567 image->pixels = NULL;
568 image->name = SCM_BOOL_F;
569 image->update_func = SCM_BOOL_F;
570
571 /* Step 3: Create the smob.
572 */
573 SCM_NEWSMOB (smob, image);
574
575 /* Step 4: Finish the initialization.
576 */
577 image->name = name;
578 image->pixels = scm_gc_malloc (width * height, "image pixels");
579
580 return smob;
581@}
582
583SCM
584clear_image (SCM image_smob)
585@{
586 int area;
587 struct image *image;
588
589 SCM_ASSERT (SCM_SMOB_PREDICATE (image_tag, image_smob),
590 image_smob, SCM_ARG1, "clear-image");
591
592 image = (struct image *) SCM_SMOB_DATA (image_smob);
593 area = image->width * image->height;
594 memset (image->pixels, 0, area);
595
596 /* Invoke the image's update function.
597 */
598 if (scm_is_true (image->update_func))
599 scm_call_0 (image->update_func);
600
601 scm_remember_upto_here_1 (image_smob);
602
603 return SCM_UNSPECIFIED;
604@}
605
606static SCM
607mark_image (SCM image_smob)
608@{
609 /* Mark the image's name and update function. */
610 struct image *image = (struct image *) SCM_SMOB_DATA (image_smob);
611
612 scm_gc_mark (image->name);
613 return image->update_func;
614@}
615
616static size_t
617free_image (SCM image_smob)
618@{
619 struct image *image = (struct image *) SCM_SMOB_DATA (image_smob);
620
621 scm_gc_free (image->pixels, image->width * image->height, "image pixels");
622 scm_gc_free (image, sizeof (struct image), "image");
623
624 return 0;
625@}
626
627static int
628print_image (SCM image_smob, SCM port, scm_print_state *pstate)
629@{
630 struct image *image = (struct image *) SCM_SMOB_DATA (image_smob);
631
632 scm_puts ("#<image ", port);
633 scm_display (image->name, port);
634 scm_puts (">", port);
635
636 /* non-zero means success */
637 return 1;
638@}
639
640void
641init_image_type (void)
642@{
643 image_tag = scm_make_smob_type ("image", sizeof (struct image));
644 scm_set_smob_mark (image_tag, mark_image);
645 scm_set_smob_free (image_tag, free_image);
646 scm_set_smob_print (image_tag, print_image);
647
648 scm_c_define_gsubr ("clear-image", 1, 0, 0, clear_image);
649 scm_c_define_gsubr ("make-image", 3, 0, 0, make_image);
650@}
651@end example
652
653Here is a sample build and interaction with the code from the
654@file{example-smob} directory, on the author's machine:
655
656@example
657zwingli:example-smob$ make CC=gcc
658gcc `guile-config compile` -c image-type.c -o image-type.o
659gcc `guile-config compile` -c myguile.c -o myguile.o
660gcc image-type.o myguile.o `guile-config link` -o myguile
661zwingli:example-smob$ ./myguile
662guile> make-image
663#<primitive-procedure make-image>
664guile> (define i (make-image "Whistler's Mother" 100 100))
665guile> i
666#<image Whistler's Mother>
667guile> (clear-image i)
668guile> (clear-image 4)
669ERROR: In procedure clear-image in expression (clear-image 4):
670ERROR: Wrong type argument in position 1: 4
671ABORT: (wrong-type-arg)
672
673Type "(backtrace)" to get more information.
674guile>
675@end example