@c This is part of the GNU Guile Reference Manual.
@c Copyright (C) 1996, 1997, 2000, 2001, 2002, 2003, 2004, 2005, 2010, 2011, 2013
@c   Free Software Foundation, Inc.
@c See the file guile.texi for copying conditions.

@node Defining New Types (Smobs)
@section Defining New Types (Smobs)

@dfn{Smobs} are Guile's mechanism for adding new primitive types to
the system. The term ``smob'' was coined by Aubrey Jaffer, who says
it comes from ``small object'', referring to the fact that they are
quite limited in size: they can hold just one pointer to a larger
memory block plus 16 extra bits.

To define a new smob type, the programmer provides Guile with some
essential information about the type --- how to print it, how to
garbage collect it, and so on --- and Guile allocates a fresh type tag
for it. The programmer can then use @code{scm_c_define_gsubr} to make
a set of C functions visible to Scheme code that create and operate on
instances of the type.

(You can find a complete version of the example code used in this
section in the Guile distribution, in @file{doc/example-smob}. That
directory includes a makefile and a suitable @code{main} function, so
you can build a complete interactive Guile shell, extended with the
datatypes described here.)

@menu
* Describing a New Type::
* Creating Smob Instances::
* Type checking::
* Garbage Collecting Smobs::
* Remembering During Operations::
* Double Smobs::
* The Complete Example::
@end menu

@node Describing a New Type
@subsection Describing a New Type

To define a new type, the programmer usually writes two functions to
manage instances of the type; Guile falls back to the defaults
described below when either is omitted:

@table @code
@item print
Guile will apply this function to each instance of the new type to print
the value, as for @code{display} or @code{write}. The default print
function prints @code{#<NAME ADDRESS>} where @code{NAME} is the first
argument passed to @code{scm_make_smob_type}.

@item equalp
If Scheme code asks the @code{equal?} function to compare two instances
of the same smob type, Guile calls this function with the two
instances, @var{a} and @var{b}. It should return @code{SCM_BOOL_T} if
@var{a} and @var{b} should be considered @code{equal?}, or
@code{SCM_BOOL_F} otherwise. If @code{equalp} is @code{NULL},
@code{equal?} will assume that two instances of this type are
never @code{equal?} unless they are @code{eq?}.
@end table

When the only resource associated with a smob is memory managed by the
garbage collector---i.e., memory allocated with the @code{scm_gc_malloc}
functions---this is sufficient. However, when a smob is associated with
other kinds of resources, it may be necessary to define one of the
following functions, or both:

@table @code
@item mark
Guile will apply this function to each instance of the new type it
encounters during garbage collection. This function is responsible for
telling the collector about any other @code{SCM} values that the object
has stored, and that are in memory regions not already scanned by the
garbage collector. @xref{Garbage Collecting Smobs}, for more details.

@item free
Guile will apply this function to each instance of the new type that is
to be deallocated. The function should release all resources held by
the object. This is analogous to the Java finalization method---it is
invoked at an unspecified time (when garbage collection occurs) after
the object is dead. @xref{Garbage Collecting Smobs}, for more details.

This function operates while the heap is in an inconsistent state and
must therefore be careful. @xref{Smobs}, for details about what this
function is allowed to do.
@end table

To actually register the new smob type, call @code{scm_make_smob_type}.
It returns a value of type @code{scm_t_bits} which identifies the new
smob type.

The four special functions described above are registered by calling
one of @code{scm_set_smob_mark}, @code{scm_set_smob_free},
@code{scm_set_smob_print}, or @code{scm_set_smob_equalp}, as
appropriate. Each function is intended to be used at most once per
type, and the call should be placed immediately following the call to
@code{scm_make_smob_type}.

There can be at most 256 different smob types in the system.
Instead of registering a huge number of smob types (for example, one
for each relevant C struct in your application), it is sometimes
better to register just one and implement a second layer of type
dispatching on top of it. This second layer might use the 16 extra
bits to extend its type, for example.
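
The second layer of dispatching mentioned above can be sketched in
plain C. This is only an illustration under stated assumptions: the
@code{subtype} field stands in for the 16 extra bits that real code
would read with @code{SCM_SMOB_FLAGS}, and the names
@code{struct fake_smob}, @code{subtype_names} and
@code{describe_subtype} are hypothetical, not part of Guile.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a smob: one immediate word plus 16 extra
   bits.  Real code would use SCM_SMOB_DATA and SCM_SMOB_FLAGS.  */
struct fake_smob
{
  void *data;
  uint16_t subtype;
};

/* Application-level subtypes that all share a single smob type tag.  */
enum { SUBTYPE_IMAGE, SUBTYPE_PALETTE, SUBTYPE_COUNT };

static const char *const subtype_names[SUBTYPE_COUNT] = {
  "image",
  "palette"
};

/* Second-layer dispatch: select behaviour from the 16-bit subtype.
   Returns NULL for a subtype this table does not know about.  */
const char *
describe_subtype (const struct fake_smob *smob)
{
  if (smob->subtype >= SUBTYPE_COUNT)
    return NULL;
  return subtype_names[smob->subtype];
}
```

A table of function pointers indexed the same way would let each
subtype supply its own print or free behaviour while consuming only one
of the 256 available smob types.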

Here is how one might declare and register a new type representing
eight-bit gray-scale images:

@example
#include <libguile.h>

struct image @{
  int width, height;
  char *pixels;
  /* The name of this image */
  SCM name;
  /* A function to call when this image is
     modified, e.g., to update the screen,
     or SCM_BOOL_F if no action necessary */
  SCM update_func;
@};

static scm_t_bits image_tag;

void
init_image_type (void)
@{
  image_tag = scm_make_smob_type ("image", sizeof (struct image));
  scm_set_smob_mark (image_tag, mark_image);
  scm_set_smob_free (image_tag, free_image);
  scm_set_smob_print (image_tag, print_image);
@}
@end example

@node Creating Smob Instances
@subsection Creating Smob Instances

Normally, smobs can have one @emph{immediate} word of data. This word
stores either a pointer to an additional memory block that holds the
real data, or the data itself when it fits. The word is
large enough for a @code{SCM} value, a pointer to @code{void}, or an
integer that fits into a @code{size_t} or @code{ssize_t}.

You can also create smobs that have two or three immediate words, and
when these words suffice to store all data, it is more efficient to use
these super-sized smobs instead of using a normal smob plus a memory
block. @xref{Double Smobs}, for their discussion.

Guile provides functions for managing memory which are often helpful
when implementing smobs. @xref{Memory Blocks}.

To retrieve the immediate word of a smob, use the macro
@code{SCM_SMOB_DATA}. It can be set with @code{SCM_SET_SMOB_DATA}.
The 16 extra bits can be accessed with @code{SCM_SMOB_FLAGS} and
@code{SCM_SET_SMOB_FLAGS}.

The two macros @code{SCM_SMOB_DATA} and @code{SCM_SET_SMOB_DATA} treat
the immediate word as if it were of type @code{scm_t_bits}, which is
an unsigned integer type large enough to hold a pointer to
@code{void}. Thus you can use these macros to store arbitrary
pointers in the smob word.
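
The idea of keeping a pointer in an integer-typed word can be shown
without Guile at all. In the sketch below, @code{word_t} is a
hypothetical stand-in for @code{scm_t_bits} (assumed here to behave
like C's @code{uintptr_t}), and the two helper functions are
illustrative names rather than Guile API.

```c
#include <stdint.h>

/* Hypothetical stand-in for scm_t_bits: an unsigned integer type wide
   enough to hold a pointer to void.  */
typedef uintptr_t word_t;

struct point { int x, y; };

/* Pack a pointer into the word, as one would before storing it.  */
word_t
pointer_to_word (void *p)
{
  return (word_t) p;
}

/* Recover the pointer.  A void pointer converted to uintptr_t and
   back compares equal to the original pointer.  */
void *
word_to_pointer (word_t w)
{
  return (void *) w;
}
```

With real smobs the casts appear at the same places: a cast to
@code{scm_t_bits} when the word is stored, and a cast back to the
structure pointer type when it is read with @code{SCM_SMOB_DATA}.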

When you want to store a @code{SCM} value directly in the immediate
word of a smob, you should use the macros @code{SCM_SMOB_OBJECT} and
@code{SCM_SET_SMOB_OBJECT} to access it.

Creating a smob instance can be tricky when it consists of multiple
steps that allocate resources. Most of the time, this is mainly about
allocating memory to hold associated data structures. Using memory
managed by the garbage collector simplifies things: the garbage
collector will automatically scan those data structures for pointers,
and reclaim them when they are no longer referenced.

Continuing the example from above, if the global variable
@code{image_tag} contains a tag returned by @code{scm_make_smob_type},
here is how we could construct a smob whose immediate word contains a
pointer to a freshly allocated @code{struct image}:

@example
SCM
make_image (SCM name, SCM s_width, SCM s_height)
@{
  SCM smob;
  struct image *image;
  int width = scm_to_int (s_width);
  int height = scm_to_int (s_height);

  /* Step 1: Allocate the memory block.  */
  image = (struct image *)
     scm_gc_malloc (sizeof (struct image), "image");

  /* Step 2: Initialize it with straight code.  */
  image->width = width;
  image->height = height;
  image->pixels = NULL;
  image->name = SCM_BOOL_F;
  image->update_func = SCM_BOOL_F;

  /* Step 3: Create the smob.  The data word has type scm_t_bits,
     so the pointer is cast explicitly.  */
  smob = scm_new_smob (image_tag, (scm_t_bits) image);

  /* Step 4: Finish the initialization.  */
  image->name = name;
  image->pixels =
     scm_gc_malloc_pointerless (width * height, "image pixels");

  return smob;
@}
@end example

We use @code{scm_gc_malloc_pointerless} for the pixel buffer to tell the
garbage collector not to scan it for pointers. Calls to
@code{scm_gc_malloc}, @code{scm_new_smob}, and
@code{scm_gc_malloc_pointerless} raise an exception in out-of-memory
conditions; the garbage collector is able to reclaim previously
allocated memory if that happens.
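
The product @code{width * height} in @code{make_image} is computed from
@code{int} values supplied by Scheme code, so a defensive
implementation might validate the dimensions before allocating. The
helper below is a hypothetical sketch of such a check; the name
@code{checked_area} and its calling convention are assumptions made for
illustration, and it is not part of the example shipped with Guile.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: compute width * height as a size_t.
   Returns 1 and stores the product in *AREA on success; returns 0 if
   either dimension is negative or the product would overflow.  */
int
checked_area (int width, int height, size_t *area)
{
  if (width < 0 || height < 0)
    return 0;
  if (height != 0 && (size_t) width > SIZE_MAX / (size_t) height)
    return 0;
  *area = (size_t) width * (size_t) height;
  return 1;
}
```

A constructor could call such a helper right after converting its
arguments and signal an error when it returns zero, before any memory
has been allocated.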

@node Type checking
@subsection Type checking

Functions that operate on smobs should check that the passed
@code{SCM} value is indeed a suitable smob before accessing its data.
They can do this with @code{scm_assert_smob_type}.

For example, here is a simple function that operates on an image smob,
and checks the type of its argument.

@example
SCM
clear_image (SCM image_smob)
@{
  int area;
  struct image *image;

  scm_assert_smob_type (image_tag, image_smob);

  image = (struct image *) SCM_SMOB_DATA (image_smob);
  area = image->width * image->height;
  memset (image->pixels, 0, area);

  /* Invoke the image's update function.  */
  if (scm_is_true (image->update_func))
    scm_call_0 (image->update_func);

  scm_remember_upto_here_1 (image_smob);

  return SCM_UNSPECIFIED;
@}
@end example

@xref{Remembering During Operations}, for an explanation of the call
to @code{scm_remember_upto_here_1}.

@node Garbage Collecting Smobs
@subsection Garbage Collecting Smobs

Once a smob has been released to the tender mercies of the Scheme
system, it must be prepared to survive garbage collection. In the
example above, all the memory associated with the smob is managed by the
garbage collector because we used the @code{scm_gc_} allocation
functions. Thus, no special care is needed: the garbage collector
automatically scans that memory and reclaims whatever is unused.

However, when data associated with a smob is managed in some other
way---e.g., @code{malloc}'d memory or file descriptors---it is possible
to specify a @emph{free} function to release those resources when the
smob is reclaimed, and a @emph{mark} function to mark Scheme objects
otherwise invisible to the garbage collector.

As described in more detail elsewhere (@pxref{Conservative GC}), every
object in the Scheme system has a @dfn{mark bit}, which the garbage
collector uses to tell live objects from dead ones. When collection
starts, every object's mark bit is clear. The collector traces pointers
through the heap, starting from objects known to be live, and sets the
mark bit on each object it encounters. When it can find no more
reachable unmarked objects, the collector walks all objects, live and
dead, frees those whose mark bits are still clear, and clears the mark
bit on the others.

The two main portions of the collection are called the @dfn{mark phase},
during which the collector marks live objects, and the @dfn{sweep
phase}, during which the collector frees all unmarked objects.
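
The two phases can be modelled with a toy heap in plain C. This is a
teaching sketch only, not how Guile's collector is implemented: the
@code{struct object} layout, the fixed number of references, and the
@code{freed} flag that stands in for actually releasing memory are all
assumptions made for the illustration.

```c
#include <stddef.h>

#define MAX_REFS 2

/* A toy heap object: a mark bit, a flag recording that it has been
   "freed", and references to other objects.  */
struct object
{
  int marked;
  int freed;
  struct object *refs[MAX_REFS];
};

/* Mark phase: set the mark bit before recursing, so that circular
   structures terminate.  */
void
mark (struct object *obj)
{
  size_t i;

  if (obj == NULL || obj->marked)
    return;
  obj->marked = 1;
  for (i = 0; i < MAX_REFS; i++)
    mark (obj->refs[i]);
}

/* Sweep phase: "free" every unmarked object and clear the mark bit on
   the survivors.  Returns the number of objects freed.  */
size_t
sweep (struct object *heap, size_t n)
{
  size_t i, count = 0;

  for (i = 0; i < n; i++)
    {
      if (heap[i].marked)
        heap[i].marked = 0;
      else if (!heap[i].freed)
        {
          heap[i].freed = 1;
          count++;
        }
    }
  return count;
}
```

Calling @code{mark} on each root and then @code{sweep} on the whole
heap leaves every reachable object alive with its mark bit clear, ready
for the next collection.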

The mark bit of a smob lives in a special memory region. When the
collector encounters a smob, it sets the smob's mark bit, and uses the
smob's type tag to find the appropriate @emph{mark} function for that
smob. It then calls this @emph{mark} function, passing it the smob as
its only argument.

The @emph{mark} function is responsible for marking any other Scheme
objects the smob refers to. If it does not do so, the objects' mark
bits will still be clear when the collector begins to sweep, and the
collector will free them. If this occurs, it will probably break, or at
least confuse, any code operating on the smob; the smob's @code{SCM}
values will have become dangling references.

To mark an arbitrary Scheme object, the @emph{mark} function calls
@code{scm_gc_mark}.

Thus, here is how we might write @code{mark_image}---again this is not
needed in our example since we used the @code{scm_gc_} allocation
routines, so this is just for the sake of illustration:

@example
SCM
mark_image (SCM image_smob)
@{
  /* Mark the image's name and update function.  */
  struct image *image = (struct image *) SCM_SMOB_DATA (image_smob);

  scm_gc_mark (image->name);
  scm_gc_mark (image->update_func);
  return SCM_BOOL_F;
@}
@end example

Note that, even though the image's @code{update_func} could be an
arbitrarily complex structure (representing a procedure and any values
enclosed in its environment), @code{scm_gc_mark} will recurse as
necessary to mark all its components. Because @code{scm_gc_mark} sets
an object's mark bit before it recurses, it is not confused by
circular structures.

As an optimization, the collector will mark whatever value is returned
by the @emph{mark} function; this helps limit depth of recursion during
the mark phase. Thus, the code above should really be written as:

@example
SCM
mark_image (SCM image_smob)
@{
  /* Mark the image's name and update function.  */
  struct image *image = (struct image *) SCM_SMOB_DATA (image_smob);

  scm_gc_mark (image->name);
  return image->update_func;
@}
@end example

Finally, when the collector encounters an unmarked smob during the sweep
phase, it uses the smob's tag to find the appropriate @emph{free}
function for the smob. It then calls that function, passing it the smob
as its only argument.

The @emph{free} function must release any resources used by the smob.
However, it must not free objects managed by the collector; the
collector will take care of them. For historical reasons, the return
type of the @emph{free} function should be @code{size_t}, an unsigned
integral type; the @emph{free} function should always return zero.

Here is how we might write the @code{free_image} function for the image
smob type---again for the sake of illustration, since our example does
not need it thanks to the use of the @code{scm_gc_} allocation routines:

@example
size_t
free_image (SCM image_smob)
@{
  struct image *image = (struct image *) SCM_SMOB_DATA (image_smob);

  scm_gc_free (image->pixels,
               image->width * image->height,
               "image pixels");
  scm_gc_free (image, sizeof (struct image), "image");
  return 0;
@}
@end example

During the sweep phase, the garbage collector will clear the mark bits
on all live objects. The code which implements a smob need not do this
itself.

There is no way for smob code to be notified when collection is
complete.

It is usually a good idea to minimize the amount of processing done
during garbage collection; keep the @emph{mark} and @emph{free}
functions very simple. Since collections occur at unpredictable times,
it is easy for any unusual activity to interfere with normal code.

@node Remembering During Operations
@subsection Remembering During Operations

@c FIXME: Remove this section?

It's important that a smob is visible to the garbage collector
whenever its contents are being accessed. Otherwise it could be freed
while code is still using it.

For example, consider a procedure to convert image data to a list of
pixel values.

@example
SCM
image_to_list (SCM image_smob)
@{
  struct image *image;
  SCM lst = SCM_EOL;
  int i;

  scm_assert_smob_type (image_tag, image_smob);

  image = (struct image *) SCM_SMOB_DATA (image_smob);
  for (i = image->width * image->height - 1; i >= 0; i--)
    lst = scm_cons (scm_from_char (image->pixels[i]), lst);

  scm_remember_upto_here_1 (image_smob);
  return lst;
@}
@end example

In the loop, only the @code{image} pointer is used and the C compiler
has no reason to keep the @code{image_smob} value anywhere. If
@code{scm_cons} results in a garbage collection, @code{image_smob} might
not be on the stack or anywhere else and could be freed, leaving the
loop accessing freed data. The use of @code{scm_remember_upto_here_1}
prevents this, by creating a reference to @code{image_smob} after all
data accesses.

There's no need to do the same for @code{lst}, since that's the return
value and the compiler will certainly keep it in a register or
somewhere throughout the routine.

The @code{clear_image} example previously shown (@pxref{Type checking})
also used @code{scm_remember_upto_here_1} for this reason.

It's only in quite rare circumstances that a missing
@code{scm_remember_upto_here_1} will bite, but when it happens the
consequences are serious. Fortunately the rule is simple: whenever
calling a Guile library function or doing something that might, ensure
that the @code{SCM} of a smob is referenced past all accesses to its
insides. Do this by adding an @code{scm_remember_upto_here_1} if
there are no other references.

In a multi-threaded program, the rule is the same. As far as a given
thread is concerned, a garbage collection still only occurs within a
Guile library function, not at an arbitrary time. (Guile waits for all
threads to reach one of its library functions, and holds them there
while the collector runs.)

@node Double Smobs
@subsection Double Smobs

@c FIXME: Remove this section?

Smobs are called smobs because they are small: they normally have
room for only one @code{void*} or @code{SCM} value plus 16 bits. The
reason for this is that smobs are directly implemented by using the
low-level, two-word cells of Guile that are also used to implement
pairs, for example. (@xref{Data Representation}, for the
details.) One word of the two-word cells is used for
@code{SCM_SMOB_DATA} (or @code{SCM_SMOB_OBJECT}), the other contains
the 16-bit type tag and the 16 extra bits.
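
Sharing one word between a type tag and extra bits is ordinary bit
packing. The sketch below uses a deliberately simplified, hypothetical
layout (tag in the low 16 bits, extra bits in the next 16) and
hypothetical function names; Guile's actual bit assignment is an
implementation detail and should only be accessed through macros such
as @code{SCM_SMOB_FLAGS}.

```c
#include <stdint.h>

/* Hypothetical layout, chosen only for illustration: the type tag in
   the low 16 bits of the word and the extra bits in the next 16.
   The word type is assumed to be at least 32 bits wide.  */
typedef uintptr_t word_t;

/* Combine a tag and 16 extra bits into a single word.  */
word_t
make_type_word (uint16_t tag, uint16_t flags)
{
  return (word_t) tag | ((word_t) flags << 16);
}

/* Extract the tag from the low 16 bits.  */
uint16_t
type_word_tag (word_t w)
{
  return (uint16_t) (w & 0xffff);
}

/* Extract the extra bits from bits 16 to 31.  */
uint16_t
type_word_flags (word_t w)
{
  return (uint16_t) ((w >> 16) & 0xffff);
}
```

Because both fields live in the same word, updating the extra bits
never disturbs the tag as long as the masks above are respected.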

In addition to the fundamental two-word cells, Guile also has
four-word cells, which are appropriately called @dfn{double cells}.
You can use them for @dfn{double smobs} and get two more immediate
words of type @code{scm_t_bits}.

A double smob is created with @code{scm_new_double_smob}. Its immediate
words can be retrieved as @code{scm_t_bits} with @code{SCM_SMOB_DATA_2}
and @code{SCM_SMOB_DATA_3} in addition to @code{SCM_SMOB_DATA}.
Unsurprisingly, the words can be set to @code{scm_t_bits} values with
@code{SCM_SET_SMOB_DATA_2} and @code{SCM_SET_SMOB_DATA_3}.

Of course there are also @code{SCM_SMOB_OBJECT_2},
@code{SCM_SMOB_OBJECT_3}, @code{SCM_SET_SMOB_OBJECT_2}, and
@code{SCM_SET_SMOB_OBJECT_3}.

@node The Complete Example
@subsection The Complete Example

Here is the complete text of the implementation of the image datatype,
as presented in the sections above. We also provide a definition for
the smob's @emph{print} function, and make some objects and functions
static, to clarify exactly what the surrounding code is using.

As mentioned above, you can find this code in the Guile distribution, in
@file{doc/example-smob}. That directory includes a makefile and a
suitable @code{main} function, so you can build a complete interactive
Guile shell, extended with the datatypes described here.

@example
/* file "image-type.c" */

#include <stdlib.h>
#include <string.h>
#include <libguile.h>

static scm_t_bits image_tag;

struct image @{
  int width, height;
  char *pixels;

  /* The name of this image */
  SCM name;

  /* A function to call when this image is
     modified, e.g., to update the screen,
     or SCM_BOOL_F if no action necessary */
  SCM update_func;
@};

static SCM
make_image (SCM name, SCM s_width, SCM s_height)
@{
  SCM smob;
  struct image *image;
  int width = scm_to_int (s_width);
  int height = scm_to_int (s_height);

  /* Step 1: Allocate the memory block.  */
  image = (struct image *)
     scm_gc_malloc (sizeof (struct image), "image");

  /* Step 2: Initialize it with straight code.  */
  image->width = width;
  image->height = height;
  image->pixels = NULL;
  image->name = SCM_BOOL_F;
  image->update_func = SCM_BOOL_F;

  /* Step 3: Create the smob.  The data word has type scm_t_bits,
     so the pointer is cast explicitly.  */
  smob = scm_new_smob (image_tag, (scm_t_bits) image);

  /* Step 4: Finish the initialization.  */
  image->name = name;
  image->pixels =
     scm_gc_malloc_pointerless (width * height, "image pixels");

  return smob;
@}

static SCM
clear_image (SCM image_smob)
@{
  int area;
  struct image *image;

  scm_assert_smob_type (image_tag, image_smob);

  image = (struct image *) SCM_SMOB_DATA (image_smob);
  area = image->width * image->height;
  memset (image->pixels, 0, area);

  /* Invoke the image's update function.  */
  if (scm_is_true (image->update_func))
    scm_call_0 (image->update_func);

  scm_remember_upto_here_1 (image_smob);

  return SCM_UNSPECIFIED;
@}

static SCM
mark_image (SCM image_smob)
@{
  /* Mark the image's name and update function.  */
  struct image *image = (struct image *) SCM_SMOB_DATA (image_smob);

  scm_gc_mark (image->name);
  return image->update_func;
@}

static size_t
free_image (SCM image_smob)
@{
  struct image *image = (struct image *) SCM_SMOB_DATA (image_smob);

  scm_gc_free (image->pixels,
               image->width * image->height,
               "image pixels");
  scm_gc_free (image, sizeof (struct image), "image");

  return 0;
@}

static int
print_image (SCM image_smob, SCM port, scm_print_state *pstate)
@{
  struct image *image = (struct image *) SCM_SMOB_DATA (image_smob);

  scm_puts ("#<image ", port);
  scm_display (image->name, port);
  scm_puts (">", port);

  /* non-zero means success */
  return 1;
@}

void
init_image_type (void)
@{
  image_tag = scm_make_smob_type ("image", sizeof (struct image));
  scm_set_smob_mark (image_tag, mark_image);
  scm_set_smob_free (image_tag, free_image);
  scm_set_smob_print (image_tag, print_image);

  scm_c_define_gsubr ("clear-image", 1, 0, 0, clear_image);
  scm_c_define_gsubr ("make-image", 3, 0, 0, make_image);
@}
@end example

Here is a sample build and interaction with the code from the
@file{example-smob} directory, on the author's machine:

@example
zwingli:example-smob$ make CC=gcc
gcc `pkg-config --cflags guile-@value{EFFECTIVE-VERSION}` -c image-type.c -o image-type.o
gcc `pkg-config --cflags guile-@value{EFFECTIVE-VERSION}` -c myguile.c -o myguile.o
gcc image-type.o myguile.o `pkg-config --libs guile-@value{EFFECTIVE-VERSION}` -o myguile
zwingli:example-smob$ ./myguile
guile> make-image
#<primitive-procedure make-image>
guile> (define i (make-image "Whistler's Mother" 100 100))
guile> i
#<image Whistler's Mother>
guile> (clear-image i)
guile> (clear-image 4)
ERROR: In procedure clear-image in expression (clear-image 4):
ERROR: Wrong type (expecting image): 4
ABORT: (wrong-type-arg)

Type "(backtrace)" to get more information.
@end example