@c -*-texinfo-*-
@c This is part of the GNU Guile Reference Manual.
@c Copyright (C) 1996, 1997, 2000, 2001, 2002, 2003, 2004, 2005, 2010, 2011, 2013
@c   Free Software Foundation, Inc.
@c See the file guile.texi for copying conditions.

@node Defining New Types (Smobs)
@section Defining New Types (Smobs)

@dfn{Smobs} are Guile's mechanism for adding new primitive types to
the system. The term ``smob'' was coined by Aubrey Jaffer, who says
it comes from ``small object'', referring to the fact that they are
quite limited in size: they can hold just one pointer to a larger
memory block plus 16 extra bits.

To define a new smob type, the programmer provides Guile with some
essential information about the type (how to print it, how to
garbage collect it, and so on), and Guile allocates a fresh type tag
for it. The programmer can then use @code{scm_c_define_gsubr} to make
a set of C functions visible to Scheme code that create and operate on
these objects.

(You can find a complete version of the example code used in this
section in the Guile distribution, in @file{doc/example-smob}. That
directory includes a makefile and a suitable @code{main} function, so
you can build a complete interactive Guile shell, extended with the
datatypes described here.)

@menu
* Describing a New Type::
* Creating Smob Instances::
* Type checking::
* Garbage Collecting Smobs::
* Remembering During Operations::
* Double Smobs::
* The Complete Example::
@end menu

@node Describing a New Type
@subsection Describing a New Type

To define a new type, the programmer can write two functions to
manage instances of the type. Both are optional:

@table @code
@item print
Guile will apply this function to each instance of the new type to print
the value, as for @code{display} or @code{write}. The default print
function prints @code{#<NAME ADDRESS>} where @code{NAME} is the first
argument passed to @code{scm_make_smob_type}.

@item equalp
If Scheme code asks the @code{equal?} function to compare two instances
of the same smob type, Guile calls this function with the two
instances, @var{a} and @var{b}. It should return
@code{SCM_BOOL_T} if @var{a} and @var{b} should be considered
@code{equal?}, or @code{SCM_BOOL_F} otherwise. If @code{equalp} is
@code{NULL}, @code{equal?} will assume that two instances of this type are
never @code{equal?} unless they are @code{eq?}.

@end table

When the only resource associated with a smob is memory managed by the
garbage collector (that is, memory allocated with the @code{scm_gc_malloc}
functions), this is sufficient. However, when a smob is associated with
other kinds of resources, it may be necessary to define one of the
following functions, or both:

@table @code
@item mark
Guile will apply this function to each instance of the new type it
encounters during garbage collection. This function is responsible for
telling the collector about any other @code{SCM} values that the object
has stored, and that are in memory regions not already scanned by the
garbage collector. @xref{Garbage Collecting Smobs}, for more details.

@item free
Guile will apply this function to each instance of the new type that is
to be deallocated. The function should release all resources held by
the object. This is analogous to the Java finalization method: it is
invoked at an unspecified time (when garbage collection occurs) after
the object is dead. @xref{Garbage Collecting Smobs}, for more details.

This function operates while the heap is in an inconsistent state and
must therefore be careful. @xref{Smobs}, for details about what this
function is allowed to do.
@end table

To actually register the new smob type, call @code{scm_make_smob_type}.
It returns a value of type @code{scm_t_bits} which identifies the new
smob type.

The four special functions described above are registered by calling
one of @code{scm_set_smob_mark}, @code{scm_set_smob_free},
@code{scm_set_smob_print}, or @code{scm_set_smob_equalp}, as
appropriate. Each function is intended to be used at most once per
type, and the call should be placed immediately following the call to
@code{scm_make_smob_type}.

There can be at most 256 different smob types in the system.
Instead of registering a huge number of smob types (for example, one
for each relevant C struct in your application), it is sometimes
better to register just one and implement a second layer of type
dispatching on top of it. This second layer might use the 16 extra
bits to extend its type, for example.

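The second dispatch layer mentioned above can be sketched in plain C,
without libguile. This is only an illustrative model, not Guile code:
the @code{subtype} field stands in for the 16 extra bits, the
@code{data} field stands in for the immediate word, and the names
@code{struct object}, @code{describe_table}, and @code{describe} are
hypothetical.

@example
#include <stdint.h>
#include <stddef.h>

/* Hypothetical subtype codes kept in the 16 extra bits.  */
enum @{ SUBTYPE_POINT = 0, SUBTYPE_RECT = 1, SUBTYPE_COUNT = 2 @};

/* Stand-in for a smob: one 16-bit field and one data word.  */
struct object @{
  uint16_t subtype;   /* models the 16 extra bits */
  void *data;         /* models the immediate word */
@};

typedef const char *(*describe_fn) (const struct object *);

static const char *
describe_point (const struct object *obj)
@{
  (void) obj;
  return "point";
@}

static const char *
describe_rect (const struct object *obj)
@{
  (void) obj;
  return "rect";
@}

/* Second-layer dispatch table, indexed by subtype code.  */
static const describe_fn describe_table[SUBTYPE_COUNT] = @{
  describe_point,
  describe_rect
@};

/* Dispatch on the subtype; return NULL for unknown codes.  */
const char *
describe (const struct object *obj)
@{
  if (obj->subtype >= SUBTYPE_COUNT)
    return NULL;
  return describe_table[obj->subtype] (obj);
@}
@end example

With a single registered smob type, a table like this lets one type
tag serve many C structs; the bounds check keeps an unexpected subtype
code from indexing past the end of the table.
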
Here is how one might declare and register a new type representing
eight-bit gray-scale images:

@example
#include <libguile.h>

struct image @{
  int width, height;
  char *pixels;

  /* The name of this image */
  SCM name;

  /* A function to call when this image is
     modified, e.g., to update the screen,
     or SCM_BOOL_F if no action necessary */
  SCM update_func;
@};

static scm_t_bits image_tag;

/* mark_image, free_image, and print_image are defined
   later in this section.  */
void
init_image_type (void)
@{
  image_tag = scm_make_smob_type ("image", sizeof (struct image));
  scm_set_smob_mark (image_tag, mark_image);
  scm_set_smob_free (image_tag, free_image);
  scm_set_smob_print (image_tag, print_image);
@}
@end example


@node Creating Smob Instances
@subsection Creating Smob Instances

Normally, smobs can have one @emph{immediate} word of data. This word
either stores a pointer to an additional memory block that holds the
real data, or holds the data itself when it fits. The word is
large enough for a @code{SCM} value, a pointer to @code{void}, or an
integer that fits into a @code{size_t} or @code{ssize_t}.

You can also create smobs that have two or three immediate words, and
when these words suffice to store all data, it is more efficient to use
these super-sized smobs instead of using a normal smob plus a memory
block. @xref{Double Smobs}, for a discussion of them.

Guile provides functions for managing memory which are often helpful
when implementing smobs. @xref{Memory Blocks}.

To retrieve the immediate word of a smob, you use the macro
@code{SCM_SMOB_DATA}. It can be set with @code{SCM_SET_SMOB_DATA}.
The 16 extra bits can be accessed with @code{SCM_SMOB_FLAGS} and
@code{SCM_SET_SMOB_FLAGS}.

The two macros @code{SCM_SMOB_DATA} and @code{SCM_SET_SMOB_DATA} treat
the immediate word as if it were of type @code{scm_t_bits}, which is
an unsigned integer type large enough to hold a pointer to
@code{void}. Thus you can use these macros to store arbitrary
pointers in the smob word.

When you want to store a @code{SCM} value directly in the immediate
word of a smob, you should use the macros @code{SCM_SMOB_OBJECT} and
@code{SCM_SET_SMOB_OBJECT} to access it.

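The idea of keeping a pointer in an unsigned integer word can be
illustrated in plain C, without libguile. In this sketch,
@code{word_t} is a hypothetical stand-in for @code{scm_t_bits},
defined here as the standard @code{uintptr_t}; the helper names
@code{word_from_pointer} and @code{word_to_pointer} are likewise made
up for illustration and are not part of Guile.

@example
#include <stdint.h>

/* Stand-in for scm_t_bits: an unsigned integer wide enough
   to hold a pointer to void.  */
typedef uintptr_t word_t;

struct point @{ int x, y; @};

/* Store a pointer in the word.  */
word_t
word_from_pointer (void *p)
@{
  return (word_t) p;
@}

/* Recover the pointer from the word.  */
void *
word_to_pointer (word_t w)
@{
  return (void *) w;
@}
@end example

Converting a pointer to such a word and back yields the original
pointer, which is why a cast is all that is needed when reading a
@code{struct} pointer back out of the immediate word.
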
Creating a smob instance can be tricky when it consists of multiple
steps that allocate resources. Most of the time, this is mainly about
allocating memory to hold associated data structures. Using memory
managed by the garbage collector simplifies things: the garbage
collector will automatically scan those data structures for pointers,
and reclaim them when they are no longer referenced.

Continuing the example from above, if the global variable
@code{image_tag} contains a tag returned by @code{scm_make_smob_type},
here is how we could construct a smob whose immediate word contains a
pointer to a freshly allocated @code{struct image}:

@example
SCM
make_image (SCM name, SCM s_width, SCM s_height)
@{
  SCM smob;
  struct image *image;
  int width = scm_to_int (s_width);
  int height = scm_to_int (s_height);

  /* Step 1: Allocate the memory block.
   */
  image = (struct image *)
    scm_gc_malloc (sizeof (struct image), "image");

  /* Step 2: Initialize it with straight code.
   */
  image->width = width;
  image->height = height;
  image->pixels = NULL;
  image->name = SCM_BOOL_F;
  image->update_func = SCM_BOOL_F;

  /* Step 3: Create the smob.  The immediate word has type
     scm_t_bits, so the pointer is cast explicitly.
   */
  smob = scm_new_smob (image_tag, (scm_t_bits) image);

  /* Step 4: Finish the initialization.
   */
  image->name = name;
  image->pixels =
    scm_gc_malloc_pointerless (width * height, "image pixels");

  return smob;
@}
@end example

We use @code{scm_gc_malloc_pointerless} for the pixel buffer to tell the
garbage collector not to scan it for pointers. Calls to
@code{scm_gc_malloc}, @code{scm_new_smob}, and
@code{scm_gc_malloc_pointerless} raise an exception in out-of-memory
conditions; the garbage collector is able to reclaim previously
allocated memory if that happens.


@node Type checking
@subsection Type checking

Functions that operate on smobs should check that the passed
@code{SCM} value indeed is a suitable smob before accessing its data.
They can do this with @code{scm_assert_smob_type}.

For example, here is a simple function that operates on an image smob,
and checks the type of its argument.

@example
SCM
clear_image (SCM image_smob)
@{
  int area;
  struct image *image;

  scm_assert_smob_type (image_tag, image_smob);

  image = (struct image *) SCM_SMOB_DATA (image_smob);
  area = image->width * image->height;
  memset (image->pixels, 0, area);

  /* Invoke the image's update function.
   */
  if (scm_is_true (image->update_func))
    scm_call_0 (image->update_func);

  scm_remember_upto_here_1 (image_smob);

  return SCM_UNSPECIFIED;
@}
@end example

@xref{Remembering During Operations}, for an explanation of the call
to @code{scm_remember_upto_here_1}.


@node Garbage Collecting Smobs
@subsection Garbage Collecting Smobs

Once a smob has been released to the tender mercies of the Scheme
system, it must be prepared to survive garbage collection. In the
example above, all the memory associated with the smob is managed by the
garbage collector because we used the @code{scm_gc_} allocation
functions. Thus, no special care is needed: the garbage collector
automatically scans them and reclaims any unused memory.

However, when data associated with a smob is managed in some other
way (e.g., @code{malloc}'d memory or file descriptors), it is possible
to specify a @emph{free} function to release those resources when the
smob is reclaimed, and a @emph{mark} function to mark Scheme objects
otherwise invisible to the garbage collector.

As described in more detail elsewhere (@pxref{Conservative GC}), every
object in the Scheme system has a @dfn{mark bit}, which the garbage
collector uses to tell live objects from dead ones. When collection
starts, every object's mark bit is clear. The collector traces pointers
through the heap, starting from objects known to be live, and sets the
mark bit on each object it encounters. When it can find no more
reachable unmarked objects, the collector walks all objects, live and
dead, frees those whose mark bits are still clear, and clears the mark
bit on the others.

The two main portions of the collection are called the @dfn{mark phase},
during which the collector marks live objects, and the @dfn{sweep
phase}, during which the collector frees all unmarked objects.

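The two phases described above can be sketched with a toy collector in
plain C. This is a simplified model under stated assumptions, not
Guile's actual collector: the heap is a small fixed array, each object
has at most one outgoing reference, and the names @code{struct obj},
@code{heap}, @code{mark}, @code{sweep}, and @code{collect} are
hypothetical.

@example
#include <stdbool.h>
#include <stddef.h>

#define MAX_OBJECTS 8

/* A toy heap object with at most one outgoing reference.  */
struct obj @{
  bool in_use;       /* slot holds an allocated object */
  bool marked;       /* the mark bit */
  struct obj *ref;   /* object this one refers to, or NULL */
@};

static struct obj heap[MAX_OBJECTS];

/* Mark phase: set the mark bit before following the reference,
   so that circular structures terminate.  */
static void
mark (struct obj *o)
@{
  while (o != NULL && !o->marked)
    @{
      o->marked = true;
      o = o->ref;
    @}
@}

/* Sweep phase: free unmarked objects and clear the mark bit on
   the survivors.  Returns the number of objects freed.  */
static int
sweep (void)
@{
  int freed = 0;
  for (int i = 0; i < MAX_OBJECTS; i++)
    @{
      if (!heap[i].in_use)
        continue;
      if (heap[i].marked)
        heap[i].marked = false;
      else
        @{
          heap[i].in_use = false;
          heap[i].ref = NULL;
          freed++;
        @}
    @}
  return freed;
@}

/* Run one collection starting from a single root.  */
int
collect (struct obj *root)
@{
  mark (root);
  return sweep ();
@}
@end example

If objects 0 and 1 refer to each other and object 0 is the root, while
object 2 refers to object 3 and neither is reachable, one call to
@code{collect} frees objects 2 and 3 and leaves every surviving mark
bit clear, ready for the next collection.
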
The mark bit of a smob lives in a special memory region. When the
collector encounters a smob, it sets the smob's mark bit, and uses the
smob's type tag to find the appropriate @emph{mark} function for that
smob. It then calls this @emph{mark} function, passing it the smob as
its only argument.

The @emph{mark} function is responsible for marking any other Scheme
objects the smob refers to. If it does not do so, the objects' mark
bits will still be clear when the collector begins to sweep, and the
collector will free them. If this occurs, it will probably break, or at
least confuse, any code operating on the smob; the smob's @code{SCM}
values will have become dangling references.

To mark an arbitrary Scheme object, the @emph{mark} function calls
@code{scm_gc_mark}.

Thus, here is how we might write @code{mark_image}. Again, this is not
needed in our example since we used the @code{scm_gc_} allocation
routines, so this is just for the sake of illustration:

@example
@group
SCM
mark_image (SCM image_smob)
@{
  /* Mark the image's name and update function.  */
  struct image *image = (struct image *) SCM_SMOB_DATA (image_smob);

  scm_gc_mark (image->name);
  scm_gc_mark (image->update_func);

  return SCM_BOOL_F;
@}
@end group
@end example

Note that, even though the image's @code{update_func} could be an
arbitrarily complex structure (representing a procedure and any values
enclosed in its environment), @code{scm_gc_mark} will recurse as
necessary to mark all its components. Because @code{scm_gc_mark} sets
an object's mark bit before it recurses, it is not confused by
circular structures.

As an optimization, the collector will mark whatever value is returned
by the @emph{mark} function; this helps limit depth of recursion during
the mark phase. Thus, the code above should really be written as:

@example
@group
SCM
mark_image (SCM image_smob)
@{
  /* Mark the image's name and update function.  */
  struct image *image = (struct image *) SCM_SMOB_DATA (image_smob);

  scm_gc_mark (image->name);
  return image->update_func;
@}
@end group
@end example


Finally, when the collector encounters an unmarked smob during the sweep
phase, it uses the smob's tag to find the appropriate @emph{free}
function for the smob. It then calls that function, passing it the smob
as its only argument.

The @emph{free} function must release any resources used by the smob.
However, it must not free objects managed by the collector; the
collector will take care of them. For historical reasons, the return
type of the @emph{free} function should be @code{size_t}, an unsigned
integral type; the @emph{free} function should always return zero.

Here is how we might write the @code{free_image} function for the image
smob type. Again, this is for the sake of illustration, since our
example does not need it thanks to the use of the @code{scm_gc_}
allocation routines:

@example
size_t
free_image (SCM image_smob)
@{
  struct image *image = (struct image *) SCM_SMOB_DATA (image_smob);

  scm_gc_free (image->pixels,
               image->width * image->height,
               "image pixels");
  scm_gc_free (image, sizeof (struct image), "image");

  return 0;
@}
@end example

During the sweep phase, the garbage collector will clear the mark bits
on all live objects. The code which implements a smob need not do this
itself.

There is no way for smob code to be notified when collection is
complete.

It is usually a good idea to minimize the amount of processing done
during garbage collection; keep the @emph{mark} and @emph{free}
functions very simple. Since collections occur at unpredictable times,
it is easy for any unusual activity to interfere with normal code.

@node Remembering During Operations
@subsection Remembering During Operations
@cindex remembering

@c FIXME: Remove this section?

It's important that a smob is visible to the garbage collector
whenever its contents are being accessed. Otherwise it could be freed
while code is still using it.

For example, consider a procedure to convert image data to a list of
pixel values.

@example
SCM
image_to_list (SCM image_smob)
@{
  struct image *image;
  SCM lst;
  int i;

  scm_assert_smob_type (image_tag, image_smob);

  image = (struct image *) SCM_SMOB_DATA (image_smob);
  lst = SCM_EOL;
  for (i = image->width * image->height - 1; i >= 0; i--)
    lst = scm_cons (scm_from_char (image->pixels[i]), lst);

  scm_remember_upto_here_1 (image_smob);
  return lst;
@}
@end example

In the loop, only the @code{image} pointer is used and the C compiler
has no reason to keep the @code{image_smob} value anywhere. If
@code{scm_cons} results in a garbage collection, @code{image_smob} might
not be on the stack or anywhere else and could be freed, leaving the
loop accessing freed data. The use of @code{scm_remember_upto_here_1}
prevents this, by creating a reference to @code{image_smob} after all
data accesses.

There's no need to do the same for @code{lst}, since that's the return
value and the compiler will certainly keep it in a register or
somewhere throughout the routine.

The @code{clear_image} example previously shown (@pxref{Type checking})
also used @code{scm_remember_upto_here_1} for this reason.

It's only in quite rare circumstances that a missing
@code{scm_remember_upto_here_1} will bite, but when it happens the
consequences are serious. Fortunately the rule is simple: whenever
calling a Guile library function or doing something that might, ensure
that the @code{SCM} of a smob is referenced past all accesses to its
insides. Do this by adding an @code{scm_remember_upto_here_1} if
there are no other references.

In a multi-threaded program, the rule is the same. As far as a given
thread is concerned, a garbage collection still only occurs within a
Guile library function, not at an arbitrary time. (Guile waits for all
threads to reach one of its library functions, and holds them there
while the collector runs.)

@node Double Smobs
@subsection Double Smobs

@c FIXME: Remove this section?

Smobs are called smobs because they are small: they normally have
room for only one @code{void*} or @code{SCM} value plus 16 bits. The
reason for this is that smobs are directly implemented by using the
low-level, two-word cells of Guile that are also used to implement
pairs, for example. (@xref{Data Representation}, for the
details.) One word of the two-word cells is used for
@code{SCM_SMOB_DATA} (or @code{SCM_SMOB_OBJECT}), the other contains
the 16-bit type tag and the 16 extra bits.

In addition to the fundamental two-word cells, Guile also has
four-word cells, which are appropriately called @dfn{double cells}.
You can use them for @dfn{double smobs} and get two more immediate
words of type @code{scm_t_bits}.

A double smob is created with @code{scm_new_double_smob}. Its immediate
words can be retrieved as @code{scm_t_bits} with @code{SCM_SMOB_DATA_2}
and @code{SCM_SMOB_DATA_3} in addition to @code{SCM_SMOB_DATA}.
Unsurprisingly, the words can be set to @code{scm_t_bits} values with
@code{SCM_SET_SMOB_DATA_2} and @code{SCM_SET_SMOB_DATA_3}.

Of course there are also @code{SCM_SMOB_OBJECT_2},
@code{SCM_SMOB_OBJECT_3}, @code{SCM_SET_SMOB_OBJECT_2}, and
@code{SCM_SET_SMOB_OBJECT_3}.

@node The Complete Example
@subsection The Complete Example

Here is the complete text of the implementation of the image datatype,
as presented in the sections above. We also provide a definition for
the smob's @emph{print} function, and make some objects and functions
static, to clarify exactly what the surrounding code is using.

As mentioned above, you can find this code in the Guile distribution, in
@file{doc/example-smob}. That directory includes a makefile and a
suitable @code{main} function, so you can build a complete interactive
Guile shell, extended with the datatypes described here.

@example
/* file "image-type.c" */

#include <stdlib.h>
#include <string.h>   /* for memset */
#include <libguile.h>

static scm_t_bits image_tag;

struct image @{
  int width, height;
  char *pixels;

  /* The name of this image */
  SCM name;

  /* A function to call when this image is
     modified, e.g., to update the screen,
     or SCM_BOOL_F if no action necessary */
  SCM update_func;
@};

static SCM
make_image (SCM name, SCM s_width, SCM s_height)
@{
  SCM smob;
  struct image *image;
  int width = scm_to_int (s_width);
  int height = scm_to_int (s_height);

  /* Step 1: Allocate the memory block.
   */
  image = (struct image *)
    scm_gc_malloc (sizeof (struct image), "image");

  /* Step 2: Initialize it with straight code.
   */
  image->width = width;
  image->height = height;
  image->pixels = NULL;
  image->name = SCM_BOOL_F;
  image->update_func = SCM_BOOL_F;

  /* Step 3: Create the smob.  The immediate word has type
     scm_t_bits, so the pointer is cast explicitly.
   */
  smob = scm_new_smob (image_tag, (scm_t_bits) image);

  /* Step 4: Finish the initialization.  The pixel buffer holds
     no pointers, so the collector need not scan it.
   */
  image->name = name;
  image->pixels =
    scm_gc_malloc_pointerless (width * height, "image pixels");

  return smob;
@}

SCM
clear_image (SCM image_smob)
@{
  int area;
  struct image *image;

  scm_assert_smob_type (image_tag, image_smob);

  image = (struct image *) SCM_SMOB_DATA (image_smob);
  area = image->width * image->height;
  memset (image->pixels, 0, area);

  /* Invoke the image's update function.
   */
  if (scm_is_true (image->update_func))
    scm_call_0 (image->update_func);

  scm_remember_upto_here_1 (image_smob);

  return SCM_UNSPECIFIED;
@}

static SCM
mark_image (SCM image_smob)
@{
  /* Mark the image's name and update function.  */
  struct image *image = (struct image *) SCM_SMOB_DATA (image_smob);

  scm_gc_mark (image->name);
  return image->update_func;
@}

static size_t
free_image (SCM image_smob)
@{
  struct image *image = (struct image *) SCM_SMOB_DATA (image_smob);

  scm_gc_free (image->pixels,
               image->width * image->height,
               "image pixels");
  scm_gc_free (image, sizeof (struct image), "image");

  return 0;
@}

static int
print_image (SCM image_smob, SCM port, scm_print_state *pstate)
@{
  struct image *image = (struct image *) SCM_SMOB_DATA (image_smob);

  scm_puts ("#<image ", port);
  scm_display (image->name, port);
  scm_puts (">", port);

  /* non-zero means success */
  return 1;
@}

void
init_image_type (void)
@{
  image_tag = scm_make_smob_type ("image", sizeof (struct image));
  scm_set_smob_mark (image_tag, mark_image);
  scm_set_smob_free (image_tag, free_image);
  scm_set_smob_print (image_tag, print_image);

  scm_c_define_gsubr ("clear-image", 1, 0, 0, clear_image);
  scm_c_define_gsubr ("make-image", 3, 0, 0, make_image);
@}
@end example

Here is a sample build and interaction with the code from the
@file{example-smob} directory, on the author's machine:

@example
zwingli:example-smob$ make CC=gcc
gcc `pkg-config --cflags guile-@value{EFFECTIVE-VERSION}` -c image-type.c -o image-type.o
gcc `pkg-config --cflags guile-@value{EFFECTIVE-VERSION}` -c myguile.c -o myguile.o
gcc image-type.o myguile.o `pkg-config --libs guile-@value{EFFECTIVE-VERSION}` -o myguile
zwingli:example-smob$ ./myguile
guile> make-image
#<primitive-procedure make-image>
guile> (define i (make-image "Whistler's Mother" 100 100))
guile> i
#<image Whistler's Mother>
guile> (clear-image i)
guile> (clear-image 4)
ERROR: In procedure clear-image in expression (clear-image 4):
ERROR: Wrong type (expecting image): 4
ABORT: (wrong-type-arg)

Type "(backtrace)" to get more information.
guile>
@end example