@c This is part of the GNU Guile Reference Manual.
@c Copyright (C) 1996, 1997, 2000, 2001, 2002, 2003, 2004, 2005
@c Free Software Foundation, Inc.
@c See the file guile.texi for copying conditions.

@node Defining New Types (Smobs)
@section Defining New Types (Smobs)

@dfn{Smobs} are Guile's mechanism for adding new primitive types to
the system. The term ``smob'' was coined by Aubrey Jaffer, who says
it comes from ``small object'', referring to the fact that they are
quite limited in size: they can hold just one pointer to a larger
memory block plus 16 extra bits.

To define a new smob type, the programmer provides Guile with some
essential information about the type (how to print it, how to
garbage collect it, and so on), and Guile allocates a fresh type tag
for it. The programmer can then use @code{scm_c_define_gsubr} to make
a set of C functions visible to Scheme code that create and operate on
instances of the new type.

(You can find a complete version of the example code used in this
section in the Guile distribution, in @file{doc/example-smob}. That
directory includes a makefile and a suitable @code{main} function, so
you can build a complete interactive Guile shell, extended with the
datatypes described here.)

@menu
* Describing a New Type::
* Creating Smob Instances::
* Type checking::
* Garbage Collecting Smobs::
* Garbage Collecting Simple Smobs::
* Remembering During Operations::
* Double Smobs::
* The Complete Example::
@end menu

@node Describing a New Type
@subsection Describing a New Type

To define a new type, the programmer can supply up to four functions to
manage instances of the type. Each has a default that applies when
no function is registered:

@table @code
@item mark
Guile will apply this function to each instance of the new type it
encounters during garbage collection. This function is responsible for
telling the collector about any other @code{SCM} values that the object
has stored. The default smob mark function does nothing.
@xref{Garbage Collecting Smobs}, for more details.

@item free
Guile will apply this function to each instance of the new type that is
to be deallocated. The function should release all resources held by
the object. This is analogous to the Java finalization method: it is
invoked at an unspecified time (when garbage collection occurs) after
the object is dead. The default free function frees the smob data (if
the size of the struct passed to @code{scm_make_smob_type} is non-zero)
using @code{scm_gc_free}. @xref{Garbage Collecting Smobs}, for more
details.

This function operates while the heap is in an inconsistent state and
must therefore be careful. @xref{Smobs}, for details about what this
function is allowed to do.

@item print
Guile will apply this function to each instance of the new type to print
the value, as for @code{display} or @code{write}. The default print
function prints @code{#<NAME ADDRESS>} where @code{NAME} is the first
argument passed to @code{scm_make_smob_type}. For more information on
printing, see @ref{Port Data}.

@item equalp
If Scheme code asks the @code{equal?} function to compare two instances
@var{a} and @var{b} of the same smob type, Guile calls this function.
It should return @code{SCM_BOOL_T} if @var{a} and @var{b} should be
considered @code{equal?}, or @code{SCM_BOOL_F} otherwise. If
@code{equalp} is @code{NULL}, @code{equal?} will assume that two
instances of this type are never @code{equal?} unless they are
@code{eq?}.
@end table

To actually register the new smob type, call @code{scm_make_smob_type}.
It returns a value of type @code{scm_t_bits} which identifies the new
smob type.

The four special functions described above are registered by calling
one of @code{scm_set_smob_mark}, @code{scm_set_smob_free},
@code{scm_set_smob_print}, or @code{scm_set_smob_equalp}, as
appropriate. Each function is intended to be used at most once per
type, and the call should be placed immediately following the call to
@code{scm_make_smob_type}.

There can be at most 256 different smob types in the system.
Instead of registering a huge number of smob types (for example, one
for each relevant C struct in your application), it is sometimes
better to register just one and implement a second layer of type
dispatching on top of it. This second layer might use the 16 extra
bits to extend its type, for example.

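The second layer of dispatching can be sketched in plain C, without
libguile. This is only a model under stated assumptions: @code{struct toy_smob},
@code{shape_ops} and @code{toy_area} are hypothetical names, and the
@code{flags} field stands in for the 16 extra bits that a real smob
exposes through @code{SCM_SMOB_FLAGS}.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a smob cell: all shapes share one type
   tag, the 16 extra bits select a sub-type, and one word holds data. */
struct toy_smob {
  uint16_t flags;   /* models the 16 extra bits: the sub-type index */
  void *data;       /* models the immediate word */
};

enum shape_kind { SHAPE_SQUARE = 0, SHAPE_RECT = 1, SHAPE_KIND_COUNT };

struct square { int side; };
struct rect { int width, height; };

static int
square_area (void *p)
{
  struct square *s = p;
  return s->side * s->side;
}

static int
rect_area (void *p)
{
  struct rect *r = p;
  return r->width * r->height;
}

/* Second-layer dispatch table, indexed by the sub-type in the flags. */
static int (*const shape_ops[SHAPE_KIND_COUNT]) (void *) = {
  square_area, rect_area
};

/* Dispatch on the sub-type; returns -1 for an unknown sub-type. */
static int
toy_area (const struct toy_smob *obj)
{
  if (obj->flags >= SHAPE_KIND_COUNT)
    return -1;
  return shape_ops[obj->flags] (obj->data);
}
```

With this arrangement, adding a shape means adding one enum value and
one table entry, while the number of registered smob types stays at one.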
Here is how one might declare and register a new type representing
eight-bit gray-scale images:

@example
#include <libguile.h>

struct image @{
  int width, height;
  char *pixels;

  /* The name of this image */
  SCM name;

  /* A function to call when this image is
     modified, e.g., to update the screen,
     or SCM_BOOL_F if no action necessary */
  SCM update_func;
@};

static scm_t_bits image_tag;

/* mark_image, free_image and print_image are
   defined later in this section. */
void
init_image_type (void)
@{
  image_tag = scm_make_smob_type ("image", sizeof (struct image));
  scm_set_smob_mark (image_tag, mark_image);
  scm_set_smob_free (image_tag, free_image);
  scm_set_smob_print (image_tag, print_image);
@}
@end example

@node Creating Smob Instances
@subsection Creating Smob Instances

Normally, smobs can have one @emph{immediate} word of data. This word
stores either a pointer to an additional memory block that holds the
real data, or it might hold the data itself when it fits. The word is
large enough for a @code{SCM} value, a pointer to @code{void}, or an
integer that fits into a @code{size_t} or @code{ssize_t}.

You can also create smobs that have two or three immediate words, and
when these words suffice to store all data, it is more efficient to use
these super-sized smobs instead of using a normal smob plus a memory
block. @xref{Double Smobs}, for details.

Guile provides functions for managing memory which are often helpful
when implementing smobs. @xref{Memory Blocks}.

To retrieve the immediate word of a smob, you use the macro
@code{SCM_SMOB_DATA}. It can be set with @code{SCM_SET_SMOB_DATA}.
The 16 extra bits can be accessed with @code{SCM_SMOB_FLAGS} and
@code{SCM_SET_SMOB_FLAGS}.

The two macros @code{SCM_SMOB_DATA} and @code{SCM_SET_SMOB_DATA} treat
the immediate word as if it were of type @code{scm_t_bits}, which is
an unsigned integer type large enough to hold a pointer to
@code{void}. Thus you can use these macros to store arbitrary
pointers in the smob word.

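The idea of keeping a pointer in an unsigned integer word can be
illustrated in standard C. The sketch below does not use libguile:
@code{word_t} is a hypothetical stand-in for @code{scm_t_bits}, defined
here as @code{uintptr_t}, and @code{pack_pointer} and
@code{unpack_point} are illustrative helpers rather than Guile
functions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for scm_t_bits: an unsigned integer type wide
   enough to hold a pointer to void. */
typedef uintptr_t word_t;

struct point { int x, y; };

/* Pack a pointer into the word, as one would before storing it. */
static word_t
pack_pointer (void *p)
{
  return (word_t) p;
}

/* Recover the pointer from the word, as one would after reading it.
   The round trip goes through void *, which is what C guarantees. */
static struct point *
unpack_point (word_t w)
{
  return (struct point *) (void *) w;
}
```

The cast back to the real struct type is the same step the examples in
this section perform on the result of @code{SCM_SMOB_DATA}.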
When you want to store a @code{SCM} value directly in the immediate
word of a smob, you should use the macros @code{SCM_SMOB_OBJECT} and
@code{SCM_SET_SMOB_OBJECT} to access it.

Creating a smob instance can be tricky when it consists of multiple
steps that allocate resources and might fail. It is recommended that
you go about creating a smob in the following way:

@enumerate
@item
Allocate the memory block for holding the data with
@code{scm_gc_malloc}.

@item
Initialize it to a valid state without calling any functions that might
cause a non-local exit. For example, initialize pointers to @code{NULL}.
Also, do not store @code{SCM} values in it that must be protected.
Initialize these fields with @code{SCM_BOOL_F}.

A valid state is one that can be safely acted upon by the @emph{mark}
and @emph{free} functions of your smob type.

@item
Create the smob using @code{SCM_NEWSMOB}, passing it the initialized
memory block. (This step will always succeed.)

@item
Complete the initialization of the memory block by, for example,
allocating additional resources and making it point to them.
@end enumerate

This procedure ensures that the smob is in a valid state as soon as it
exists, that all resources that are allocated for the smob are
properly associated with it so that they can be properly freed, and
that no @code{SCM} values that need to be protected are stored in it
while the smob does not yet completely exist and thus cannot protect
them.

Continuing the example from above, if the global variable
@code{image_tag} contains a tag returned by @code{scm_make_smob_type},
here is how we could construct a smob whose immediate word contains a
pointer to a freshly allocated @code{struct image}:

@example
SCM
make_image (SCM name, SCM s_width, SCM s_height)
@{
  SCM smob;
  struct image *image;
  int width = scm_to_int (s_width);
  int height = scm_to_int (s_height);

  /* Step 1: Allocate the memory block. */
  image = (struct image *)
     scm_gc_malloc (sizeof (struct image), "image");

  /* Step 2: Initialize it with straight code. */
  image->width = width;
  image->height = height;
  image->pixels = NULL;
  image->name = SCM_BOOL_F;
  image->update_func = SCM_BOOL_F;

  /* Step 3: Create the smob. */
  SCM_NEWSMOB (smob, image_tag, image);

  /* Step 4: Finish the initialization. */
  image->name = name;
  image->pixels =
     scm_gc_malloc (width * height, "image pixels");

  return smob;
@}
@end example

Let us look at what might happen when @code{make_image} is called.

The conversions of @var{s_width} and @var{s_height} to @code{int}s might
fail and signal an error, thus causing a non-local exit. This is not a
problem since no resources have been allocated yet that would have to be
freed.

The allocation of @var{image} in step 1 might fail, but this is likewise
no problem.

Step 2 cannot exit non-locally. At the end of it, the @var{image}
struct is in a valid state for the @code{mark_image} and
@code{free_image} functions (see below).

Step 3 cannot exit non-locally either. This is guaranteed by Guile.
After it, @var{smob} contains a valid smob that is properly initialized
and protected, and in turn can properly protect the Scheme values in its
@var{image} struct.

But before the smob is completely created, @code{SCM_NEWSMOB} might
cause the garbage collector to run. During this garbage collection, the
@code{SCM} values in the @var{image} struct would be invisible to Guile.
It only gets to know about them via the @code{mark_image} function, but
that function cannot yet do its job since the smob has not been created
yet. Thus, it is important not to store @code{SCM} values in the
@var{image} struct until after the smob has been created.

Step 4, finally, might fail and cause a non-local exit. In that case,
the complete creation of the smob has not been successful, but it does
nevertheless exist in a valid state. It will eventually be freed by
the garbage collector, and all the resources that have been allocated
for it will be correctly freed by @code{free_image}.

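The notion of a valid state that the @emph{free} function can always
handle can be modelled in standard C, without libguile. In this sketch
all names (@code{toy_image}, @code{toy_image_begin},
@code{toy_image_finish}, @code{toy_image_free}) are hypothetical,
failure is reported by a return value instead of a non-local exit, and
@code{malloc}, @code{calloc} and @code{free} stand in for the Guile
allocation functions.

```c
#include <assert.h>
#include <stdlib.h>

struct toy_image {
  int width, height;
  char *pixels;          /* NULL until the last step succeeds */
};

/* Steps 1 and 2: allocate the block and put it in a valid state.
   Returns NULL if the allocation fails. */
static struct toy_image *
toy_image_begin (int width, int height)
{
  struct toy_image *image = malloc (sizeof *image);
  if (image == NULL)
    return NULL;
  image->width = width;
  image->height = height;
  image->pixels = NULL;
  return image;
}

/* Step 4: acquire the remaining resource.  Returns 0 on failure and
   leaves the object in its earlier, still valid, state. */
static int
toy_image_finish (struct toy_image *image)
{
  if (image->width <= 0 || image->height <= 0)
    return 0;
  image->pixels = calloc ((size_t) image->width, (size_t) image->height);
  return image->pixels != NULL;
}

/* Models the free function: correct whether or not step 4 ran. */
static void
toy_image_free (struct toy_image *image)
{
  free (image->pixels);   /* free (NULL) is a no-op */
  free (image);
}
```

Because @code{pixels} is @code{NULL} from the moment the object
exists, the cleanup code never has to know how far construction got.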
@node Type checking
@subsection Type checking

Functions that operate on smobs should check that the passed
@code{SCM} value indeed is a suitable smob before accessing its data.
They can do this with @code{scm_assert_smob_type}.

For example, here is a simple function that operates on an image smob,
and checks the type of its argument.

@example
SCM
clear_image (SCM image_smob)
@{
  int area;
  struct image *image;

  scm_assert_smob_type (image_tag, image_smob);

  image = (struct image *) SCM_SMOB_DATA (image_smob);
  area = image->width * image->height;
  memset (image->pixels, 0, area);  /* needs <string.h> */

  /* Invoke the image's update function. */
  if (scm_is_true (image->update_func))
    scm_call_0 (image->update_func);

  scm_remember_upto_here_1 (image_smob);

  return SCM_UNSPECIFIED;
@}
@end example

@xref{Remembering During Operations}, for an explanation of the call
to @code{scm_remember_upto_here_1}.

@node Garbage Collecting Smobs
@subsection Garbage Collecting Smobs

Once a smob has been released to the tender mercies of the Scheme
system, it must be prepared to survive garbage collection. Guile calls
the @emph{mark} and @emph{free} functions of the smob to manage this.

As described in more detail elsewhere (@pxref{Conservative GC}), every
object in the Scheme system has a @dfn{mark bit}, which the garbage
collector uses to tell live objects from dead ones. When collection
starts, every object's mark bit is clear. The collector traces pointers
through the heap, starting from objects known to be live, and sets the
mark bit on each object it encounters. When it can find no more
unmarked objects, the collector walks all objects, live and dead, frees
those whose mark bits are still clear, and clears the mark bit on the
rest.

The two main portions of the collection are called the @dfn{mark phase},
during which the collector marks live objects, and the @dfn{sweep
phase}, during which the collector frees all unmarked objects.

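The two phases can be illustrated with a toy collector in standard C.
This is a model only and says nothing about how Guile's collector is
implemented: @code{toy_object}, @code{heap}, @code{toy_mark} and
@code{toy_sweep} are hypothetical names, and the heap is a small fixed
array.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_OBJECTS 8

/* Hypothetical toy object: a mark bit, an "allocated" flag, and up to
   two references to other objects (NULL when unused). */
struct toy_object {
  int marked;
  int allocated;
  struct toy_object *ref[2];
};

static struct toy_object heap[MAX_OBJECTS];

/* Mark phase: set the mark bit first, then follow references.
   Setting the bit before recursing is what makes cycles terminate. */
static void
toy_mark (struct toy_object *obj)
{
  if (obj == NULL || obj->marked)
    return;
  obj->marked = 1;
  toy_mark (obj->ref[0]);
  toy_mark (obj->ref[1]);
}

/* Sweep phase: free unmarked objects and clear the mark bit on the
   rest.  Returns the number of objects freed. */
static int
toy_sweep (void)
{
  int freed = 0;
  for (size_t i = 0; i < MAX_OBJECTS; i++)
    {
      if (!heap[i].allocated)
        continue;
      if (heap[i].marked)
        heap[i].marked = 0;
      else
        {
          heap[i].allocated = 0;
          freed++;
        }
    }
  return freed;
}
```

Calling @code{toy_mark} on each root and then @code{toy_sweep} once
corresponds to one collection in this model.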
The mark bit of a smob lives in a special memory region. When the
collector encounters a smob, it sets the smob's mark bit, and uses the
smob's type tag to find the appropriate @emph{mark} function for that
smob. It then calls this @emph{mark} function, passing it the smob as
its only argument.

The @emph{mark} function is responsible for marking any other Scheme
objects the smob refers to. If it does not do so, the objects' mark
bits will still be clear when the collector begins to sweep, and the
collector will free them. If this occurs, it will probably break, or at
least confuse, any code operating on the smob; the smob's @code{SCM}
values will have become dangling references.

To mark an arbitrary Scheme object, the @emph{mark} function calls
@code{scm_gc_mark}.

Thus, here is how we might write @code{mark_image}:

@example
SCM
mark_image (SCM image_smob)
@{
  /* Mark the image's name and update function. */
  struct image *image = (struct image *) SCM_SMOB_DATA (image_smob);

  scm_gc_mark (image->name);
  scm_gc_mark (image->update_func);

  return SCM_BOOL_F;
@}
@end example

Note that, even though the image's @code{update_func} could be an
arbitrarily complex structure (representing a procedure and any values
enclosed in its environment), @code{scm_gc_mark} will recurse as
necessary to mark all its components. Because @code{scm_gc_mark} sets
an object's mark bit before it recurses, it is not confused by
circular structures.

As an optimization, the collector will mark whatever value is returned
by the @emph{mark} function; this helps limit depth of recursion during
the mark phase. Thus, the code above should really be written as:

@example
SCM
mark_image (SCM image_smob)
@{
  /* Mark the image's name and update function. */
  struct image *image = (struct image *) SCM_SMOB_DATA (image_smob);

  scm_gc_mark (image->name);
  return image->update_func;
@}
@end example

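Why returning a value limits recursion depth can be seen in a small
standalone model. The names here (@code{toy_node}, @code{toy_mark_fn},
@code{toy_collect_mark}) are hypothetical, and the counters
@code{depth} and @code{max_depth} exist only to instrument the example;
the sketch assumes nothing about Guile's internals beyond the behavior
described above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical list node: `item` is marked by a nested call, while
   `next` is handed back to the collector's loop. */
struct toy_node {
  int marked;
  struct toy_node *item;
  struct toy_node *next;
};

static int depth, max_depth;   /* instrumentation for the example */

static void toy_collect_mark (struct toy_node *obj);

/* Models a smob mark function: mark one reference explicitly and
   return the other instead of recursing on it. */
static struct toy_node *
toy_mark_fn (struct toy_node *obj)
{
  toy_collect_mark (obj->item);
  return obj->next;
}

/* Models the collector: loop on the returned value, so a long chain
   of `next` links costs no extra stack depth. */
static void
toy_collect_mark (struct toy_node *obj)
{
  depth++;
  if (depth > max_depth)
    max_depth = depth;
  while (obj != NULL && !obj->marked)
    {
      obj->marked = 1;
      obj = toy_mark_fn (obj);
    }
  depth--;
}
```

In this model a chain of any length linked through @code{next} is
marked at a constant nesting depth, whereas marking @code{next} with a
nested call would nest once per node.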
Finally, when the collector encounters an unmarked smob during the sweep
phase, it uses the smob's tag to find the appropriate @emph{free}
function for the smob. It then calls that function, passing it the smob
as its only argument.

The @emph{free} function must release any resources used by the smob.
However, it must not free objects managed by the collector; the
collector will take care of them. For historical reasons, the return
type of the @emph{free} function should be @code{size_t}, an unsigned
integral type; the @emph{free} function should always return zero.

Here is how we might write the @code{free_image} function for the image
smob type:

@example
size_t
free_image (SCM image_smob)
@{
  struct image *image = (struct image *) SCM_SMOB_DATA (image_smob);

  /* pixels is still NULL if make_image failed in step 4. */
  if (image->pixels != NULL)
    scm_gc_free (image->pixels,
                 image->width * image->height,
                 "image pixels");
  scm_gc_free (image, sizeof (struct image), "image");

  return 0;
@}
@end example

During the sweep phase, the garbage collector will clear the mark bits
on all live objects. The code which implements a smob need not do this
itself.

There is no way for smob code to be notified when collection is
complete.

It is usually a good idea to minimize the amount of processing done
during garbage collection; keep the @emph{mark} and @emph{free}
functions very simple. Since collections occur at unpredictable times,
it is easy for any unusual activity to interfere with normal code.

@node Garbage Collecting Simple Smobs
@subsection Garbage Collecting Simple Smobs

It is often useful to define very simple smob types: smobs which have
no data to mark, other than the cell itself, or smobs whose immediate
data word is simply an ordinary Scheme object, to be marked recursively.
Guile provides support for these common cases, which you can use for
your smob type's @emph{mark} function if your smob's structure is
simple enough.

If the smob refers to no other Scheme objects, then no action is
necessary; the garbage collector has already marked the smob cell
itself. In that case, you can use zero as your mark function.

If the smob refers to exactly one other Scheme object via its first
immediate word, you can use @code{scm_markcdr} as its mark function.
Its definition is simply:

@example
SCM
scm_markcdr (SCM obj)
@{
  return SCM_SMOB_OBJECT (obj);
@}
@end example

@node Remembering During Operations
@subsection Remembering During Operations

It's important that a smob is visible to the garbage collector
whenever its contents are being accessed. Otherwise it could be freed
while code is still using it.

For example, consider a procedure to convert image data to a list of
pixel values.

@example
SCM
image_to_list (SCM image_smob)
@{
  struct image *image;
  SCM lst;
  int i;

  scm_assert_smob_type (image_tag, image_smob);

  image = (struct image *) SCM_SMOB_DATA (image_smob);
  lst = SCM_EOL;
  for (i = image->width * image->height - 1; i >= 0; i--)
    lst = scm_cons (scm_from_char (image->pixels[i]), lst);

  scm_remember_upto_here_1 (image_smob);
  return lst;
@}
@end example

In the loop, only the @code{image} pointer is used and the C compiler
has no reason to keep the @code{image_smob} value anywhere. If
@code{scm_cons} results in a garbage collection, @code{image_smob} might
not be on the stack or anywhere else and could be freed, leaving the
loop accessing freed data. The use of @code{scm_remember_upto_here_1}
prevents this, by creating a reference to @code{image_smob} after all
data accesses.

There's no need to do the same for @code{lst}, since that's the return
value and the compiler will certainly keep it in a register or
somewhere throughout the routine.

The @code{clear_image} example previously shown (@pxref{Type checking})
also used @code{scm_remember_upto_here_1} for this reason.

It's only in quite rare circumstances that a missing
@code{scm_remember_upto_here_1} will bite, but when it happens the
consequences are serious. Fortunately the rule is simple: whenever
you call a Guile library function, or do something that might call
one, ensure that the @code{SCM} of a smob is referenced past all
accesses to its insides. Do this by adding an
@code{scm_remember_upto_here_1} if there are no other references.

In a multi-threaded program, the rule is the same. As far as a given
thread is concerned, a garbage collection still only occurs within a
Guile library function, not at an arbitrary time. (Guile waits for all
threads to reach one of its library functions, and holds them there
while the collector runs.)

@node Double Smobs
@subsection Double Smobs

Smobs are called smobs because they are small: they normally have only
room for one @code{void*} or @code{SCM} value plus 16 bits. The
reason for this is that smobs are directly implemented by using the
low-level, two-word cells of Guile that are also used to implement
pairs, for example. (@xref{The Libguile Runtime Environment}, for the
details.) One word of the two-word cells is used for
@code{SCM_SMOB_DATA} (or @code{SCM_SMOB_OBJECT}), the other contains
the 16-bit type tag and the 16 extra bits.

In addition to the fundamental two-word cells, Guile also has
four-word cells, which are appropriately called @dfn{double cells}.
You can use them for @dfn{double smobs} and get two more immediate
words of type @code{scm_t_bits}.

A double smob is created with @code{SCM_NEWSMOB2} or
@code{SCM_NEWSMOB3} instead of @code{SCM_NEWSMOB}. Its immediate
words can be retrieved as @code{scm_t_bits} with
@code{SCM_SMOB_DATA_2} and @code{SCM_SMOB_DATA_3} in addition to
@code{SCM_SMOB_DATA}. Unsurprisingly, the words can be set to
@code{scm_t_bits} values with @code{SCM_SET_SMOB_DATA_2} and
@code{SCM_SET_SMOB_DATA_3}.

Of course there are also @code{SCM_SMOB_OBJECT_2},
@code{SCM_SMOB_OBJECT_3}, @code{SCM_SET_SMOB_OBJECT_2}, and
@code{SCM_SET_SMOB_OBJECT_3}.

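The cell layout described in this section can be pictured with a small
standalone model. Everything in it is an assumption made for
illustration: @code{word_t} stands in for @code{scm_t_bits},
@code{struct toy_double_cell} and the @code{toy_} functions are
hypothetical, and the placement of the tag in the low 16 bits of the
first word (with the extra bits directly above it) is chosen for this
model only and may differ from Guile's actual bit layout. Words are
assumed to be at least 32 bits wide.

```c
#include <assert.h>
#include <stdint.h>

typedef uintptr_t word_t;   /* hypothetical stand-in for scm_t_bits */

/* Model of a double cell: word 0 carries tag and flags, words 1 to 3
   are the immediate data words. */
struct toy_double_cell {
  word_t word[4];
};

/* Assumed layout for this model only: tag in the low 16 bits of
   word 0, the 16 extra bits directly above it. */
static void
toy_init (struct toy_double_cell *c, uint16_t tag, uint16_t flags,
          word_t d1, word_t d2, word_t d3)
{
  c->word[0] = (word_t) tag | ((word_t) flags << 16);
  c->word[1] = d1;
  c->word[2] = d2;
  c->word[3] = d3;
}

/* Extract the 16-bit type tag from word 0. */
static uint16_t
toy_tag (const struct toy_double_cell *c)
{
  return (uint16_t) (c->word[0] & 0xffff);
}

/* Extract the 16 extra bits from word 0. */
static uint16_t
toy_flags (const struct toy_double_cell *c)
{
  return (uint16_t) ((c->word[0] >> 16) & 0xffff);
}
```

An ordinary smob corresponds to the first two words of this model; a
double smob adds the remaining two.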
@node The Complete Example
@subsection The Complete Example

Here is the complete text of the implementation of the image datatype,
as presented in the sections above. We also provide a definition for
the smob's @emph{print} function, and make some objects and functions
static, to clarify exactly what the surrounding code is using.

As mentioned above, you can find this code in the Guile distribution, in
@file{doc/example-smob}. That directory includes a makefile and a
suitable @code{main} function, so you can build a complete interactive
Guile shell, extended with the datatypes described here.

@example
/* file "image-type.c" */

#include <string.h>    /* for memset */
#include <libguile.h>

static scm_t_bits image_tag;

struct image @{
  int width, height;
  char *pixels;

  /* The name of this image */
  SCM name;

  /* A function to call when this image is
     modified, e.g., to update the screen,
     or SCM_BOOL_F if no action necessary */
  SCM update_func;
@};

static SCM
make_image (SCM name, SCM s_width, SCM s_height)
@{
  SCM smob;
  struct image *image;
  int width = scm_to_int (s_width);
  int height = scm_to_int (s_height);

  /* Step 1: Allocate the memory block. */
  image = (struct image *)
     scm_gc_malloc (sizeof (struct image), "image");

  /* Step 2: Initialize it with straight code. */
  image->width = width;
  image->height = height;
  image->pixels = NULL;
  image->name = SCM_BOOL_F;
  image->update_func = SCM_BOOL_F;

  /* Step 3: Create the smob. */
  SCM_NEWSMOB (smob, image_tag, image);

  /* Step 4: Finish the initialization. */
  image->name = name;
  image->pixels =
     scm_gc_malloc (width * height, "image pixels");

  return smob;
@}

static SCM
clear_image (SCM image_smob)
@{
  int area;
  struct image *image;

  scm_assert_smob_type (image_tag, image_smob);

  image = (struct image *) SCM_SMOB_DATA (image_smob);
  area = image->width * image->height;
  memset (image->pixels, 0, area);

  /* Invoke the image's update function. */
  if (scm_is_true (image->update_func))
    scm_call_0 (image->update_func);

  scm_remember_upto_here_1 (image_smob);

  return SCM_UNSPECIFIED;
@}

static SCM
mark_image (SCM image_smob)
@{
  /* Mark the image's name and update function. */
  struct image *image = (struct image *) SCM_SMOB_DATA (image_smob);

  scm_gc_mark (image->name);
  return image->update_func;
@}

static size_t
free_image (SCM image_smob)
@{
  struct image *image = (struct image *) SCM_SMOB_DATA (image_smob);

  /* pixels is still NULL if make_image failed in step 4. */
  if (image->pixels != NULL)
    scm_gc_free (image->pixels,
                 image->width * image->height,
                 "image pixels");
  scm_gc_free (image, sizeof (struct image), "image");

  return 0;
@}

static int
print_image (SCM image_smob, SCM port, scm_print_state *pstate)
@{
  struct image *image = (struct image *) SCM_SMOB_DATA (image_smob);

  scm_puts ("#<image ", port);
  scm_display (image->name, port);
  scm_puts (">", port);

  /* non-zero means success */
  return 1;
@}

void
init_image_type (void)
@{
  image_tag = scm_make_smob_type ("image", sizeof (struct image));
  scm_set_smob_mark (image_tag, mark_image);
  scm_set_smob_free (image_tag, free_image);
  scm_set_smob_print (image_tag, print_image);

  scm_c_define_gsubr ("clear-image", 1, 0, 0, clear_image);
  scm_c_define_gsubr ("make-image", 3, 0, 0, make_image);
@}
@end example

Here is a sample build and interaction with the code from the
@file{example-smob} directory, on the author's machine:

@example
zwingli:example-smob$ make CC=gcc
gcc `guile-config compile` -c image-type.c -o image-type.o
gcc `guile-config compile` -c myguile.c -o myguile.o
gcc image-type.o myguile.o `guile-config link` -o myguile
zwingli:example-smob$ ./myguile
guile> make-image
#<primitive-procedure make-image>
guile> (define i (make-image "Whistler's Mother" 100 100))
guile> i
#<image Whistler's Mother>
guile> (clear-image i)
guile> (clear-image 4)
ERROR: In procedure clear-image in expression (clear-image 4):
ERROR: Wrong type (expecting image): 4
ABORT: (wrong-type-arg)

Type "(backtrace)" to get more information.
@end example