Commit | Line | Data |
---|---|---|
38a93523 NJ |
1 | @c essay \input texinfo |
2 | @c essay @c -*-texinfo-*- | |
3 | @c essay @c %**start of header | |
4 | @c essay @setfilename data-rep.info | |
5 | @c essay @settitle Data Representation in Guile | |
6 | @c essay @c %**end of header | |
7 | ||
8 | @c essay @include version.texi | |
9 | ||
10 | @c essay @dircategory The Algorithmic Language Scheme | |
11 | @c essay @direntry | |
12 | @c essay * data-rep: (data-rep). Data Representation in Guile --- how to use | |
12e5078c | 13 | @c essay Guile objects in your C code. |
38a93523 NJ |
14 | @c essay @end direntry |
15 | ||
16 | @c essay @setchapternewpage off | |
17 | ||
18 | @c essay @ifinfo | |
19 | @c essay Data Representation in Guile | |
20 | ||
21 | @c essay Copyright (C) 1998, 1999, 2000 Free Software Foundation | |
22 | ||
23 | @c essay Permission is granted to make and distribute verbatim copies of | |
24 | @c essay this manual provided the copyright notice and this permission notice | |
25 | @c essay are preserved on all copies. | |
26 | ||
27 | @c essay @ignore | |
28 | @c essay Permission is granted to process this file through TeX and print the | |
29 | @c essay results, provided the printed document carries copying permission | |
30 | @c essay notice identical to this one except for the removal of this paragraph | |
31 | @c essay (this paragraph not being relevant to the printed manual). | |
32 | @c essay @end ignore | |
33 | ||
34 | @c essay Permission is granted to copy and distribute modified versions of this | |
35 | @c essay manual under the conditions for verbatim copying, provided that the entire | |
36 | @c essay resulting derived work is distributed under the terms of a permission | |
37 | @c essay notice identical to this one. | |
38 | ||
39 | @c essay Permission is granted to copy and distribute translations of this manual | |
40 | @c essay into another language, under the above conditions for modified versions, | |
41 | @c essay except that this permission notice may be stated in a translation approved | |
42 | @c essay by the Free Software Foundation. | |
43 | @c essay @end ifinfo | |
44 | ||
45 | @c essay @titlepage | |
46 | @c essay @sp 10 | |
47 | @c essay @comment The title is printed in a large font. | |
48 | @c essay @title Data Representation in Guile | |
bcf009c3 | 49 | @c essay @subtitle $Id: data-rep.texi,v 1.8 2002-08-08 21:47:53 ossau Exp $ |
38a93523 NJ |
50 | @c essay @subtitle For use with Guile @value{VERSION} |
51 | @c essay @author Jim Blandy | |
52 | @c essay @author Free Software Foundation | |
53 | @c essay @author @email{jimb@@red-bean.com} | |
54 | @c essay @c The following two commands start the copyright page. | |
55 | @c essay @page | |
56 | @c essay @vskip 0pt plus 1filll | |
57 | @c essay @vskip 0pt plus 1filll | |
58 | @c essay Copyright @copyright{} 1998 Free Software Foundation | |
59 | ||
60 | @c essay Permission is granted to make and distribute verbatim copies of | |
61 | @c essay this manual provided the copyright notice and this permission notice | |
62 | @c essay are preserved on all copies. | |
63 | ||
64 | @c essay Permission is granted to copy and distribute modified versions of this | |
65 | @c essay manual under the conditions for verbatim copying, provided that the entire | |
66 | @c essay resulting derived work is distributed under the terms of a permission | |
67 | @c essay notice identical to this one. | |
68 | ||
69 | @c essay Permission is granted to copy and distribute translations of this manual | |
70 | @c essay into another language, under the above conditions for modified versions, | |
71 | @c essay except that this permission notice may be stated in a translation approved | |
72 | @c essay by Free Software Foundation. | |
73 | @c essay @end titlepage | |
74 | ||
75 | @c essay @c @smallbook | |
76 | @c essay @c @finalout | |
77 | @c essay @headings double | |
78 | ||
79 | ||
80 | @c essay @node Top, Data Representation in Scheme, (dir), (dir) | |
81 | @c essay @top Data Representation in Guile | |
82 | ||
83 | @c essay @ifinfo | |
84 | @c essay This essay is meant to provide the background necessary to read and | |
85 | @c essay write C code that manipulates Scheme values in a way that conforms to | |
86 | @c essay libguile's interface. If you would like to write or maintain a | |
87 | @c essay Guile-based application in C or C++, this is the first information you | |
88 | @c essay need. | |
89 | ||
90 | @c essay In order to make sense of Guile's @code{SCM_} functions, or read | |
91 | @c essay libguile's source code, it's essential to have a good grasp of how Guile | |
92 | @c essay actually represents Scheme values. Otherwise, a lot of the code, and | |
93 | @c essay the conventions it follows, won't make very much sense. | |
94 | ||
95 | @c essay We assume you know both C and Scheme, but we do not assume you are | |
96 | @c essay familiar with Guile's C interface. | |
97 | @c essay @end ifinfo | |
98 | ||
99 | ||
100 | @page | |
101 | @node Data Representation | |
102 | @chapter Data Representation in Guile | |
103 | ||
104 | @strong{by Jim Blandy} | |
105 | ||
106 | [Due to the rather non-orthogonal and performance-oriented nature of the | |
107 | SCM interface, you need to understand SCM internals *before* you can use | |
108 | the SCM API. That's why this chapter comes first.] | |
109 | ||
110 | [NOTE: this is Jim Blandy's essay almost entirely unmodified. It has to | |
111 | be adapted to fit this manual smoothly.] | |
112 | ||
113 | In order to make sense of Guile's @code{SCM_} functions, or read libguile's | |
114 | source code, it's essential to have a good grasp of how Guile actually | |
115 | represents Scheme values. Otherwise, a lot of the code, and the | |
116 | conventions it follows, won't make very much sense. This essay is meant | |
117 | to provide the background necessary to read and write C code that | |
118 | manipulates Scheme values in a way that is compatible with libguile. | |
119 | ||
120 | We assume you know both C and Scheme, but we do not assume you are | |
121 | familiar with Guile's implementation. | |
122 | ||
123 | @menu | |
124 | * Data Representation in Scheme:: Why things aren't just totally | |
125 | straightforward, in general terms. | |
126 | * How Guile does it:: How to write C code that manipulates | |
127 | Guile values, with an explanation | |
128 | of Guile's garbage collector. | |
129 | * Defining New Types (Smobs):: How to extend Guile with your own | |
130 | application-specific datatypes. | |
131 | @end menu | |
132 | ||
133 | @node Data Representation in Scheme | |
134 | @section Data Representation in Scheme | |
135 | ||
136 | Scheme is a latently-typed language; this means that the system cannot, | |
137 | in general, determine the type of a given expression at compile time. | |
138 | Types only become apparent at run time. Variables do not have fixed | |
139 | types; a variable may hold a pair at one point, an integer at the next, | |
140 | and a thousand-element vector later. Instead, values, not variables, | |
141 | have fixed types. | |
142 | ||
143 | In order to implement standard Scheme functions like @code{pair?} and | |
144 | @code{string?} and provide garbage collection, the representation of | |
145 | every value must contain enough information to accurately determine its | |
146 | type at run time. Often, Scheme systems also use this information to | |
147 | determine whether a program has attempted to apply an operation to an | |
148 | inappropriately typed value (such as taking the @code{car} of a string). | |
149 | ||
150 | Because variables, pairs, and vectors may hold values of any type, | |
151 | Scheme implementations use a uniform representation for values --- a | |
152 | single type large enough to hold either a complete value or a pointer | |
153 | to a complete value, along with the necessary typing information. | |
154 | ||
155 | The following sections will present a simple typing system, and then | |
156 | make some refinements to correct its major weaknesses. However, this is | |
157 | not a description of the system Guile actually uses. It is only an | |
158 | illustration of the issues Guile's system must address. We provide all | |
159 | the information one needs to work with Guile's data in @ref{How Guile | |
160 | does it}. | |
161 | ||
162 | ||
163 | @menu | |
164 | * A Simple Representation:: | |
165 | * Faster Integers:: | |
166 | * Cheaper Pairs:: | |
167 | * Guile Is Hairier:: | |
168 | @end menu | |
169 | ||
170 | @node A Simple Representation | |
171 | @subsection A Simple Representation | |
172 | ||
173 | The simplest way to meet the above requirements in C would be to | |
174 | represent each value as a pointer to a structure containing a type | |
175 | indicator, followed by a union carrying the real value. Assuming that | |
176 | @code{SCM} is the name of our universal type, we can write: | |
177 | ||
178 | @example | |
179 | enum type @{ integer, pair, string, vector, ... @}; | |
180 | ||
181 | typedef struct value *SCM; | |
182 | ||
183 | struct value @{ | |
184 | enum type type; | |
185 | union @{ | |
186 | int integer; | |
187 | struct @{ SCM car, cdr; @} pair; | |
188 | struct @{ int length; char *elts; @} string; | |
189 | struct @{ int length; SCM *elts; @} vector; | |
190 | ... | |
191 | @} value; | |
192 | @}; | |
193 | @end example | |
194 | with the ellipses replaced with code for the remaining Scheme types. | |
195 | ||
196 | This representation is sufficient to implement all of Scheme's | |
197 | semantics. If @var{x} is an @code{SCM} value: | |
198 | @itemize @bullet | |
199 | @item | |
200 | To test if @var{x} is an integer, we can write @code{@var{x}->type == integer}. | |
201 | @item | |
202 | To find its value, we can write @code{@var{x}->value.integer}. | |
203 | @item | |
204 | To test if @var{x} is a vector, we can write @code{@var{x}->type == vector}. | |
205 | @item | |
206 | If we know @var{x} is a vector, we can write | |
207 | @code{@var{x}->value.vector.elts[0]} to refer to its first element. | |
208 | @item | |
209 | If we know @var{x} is a pair, we can write | |
210 | @code{@var{x}->value.pair.car} to extract its car. | |
211 | @end itemize | |
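As a rough illustration of the representation above, here is a minimal, self-contained C sketch that models only integers and pairs. The helper names make_integer, cons, is_pair and checked_car are hypothetical and exist only for this example; they are not part of libguile.

```c
#include <assert.h>
#include <stdlib.h>

enum type { integer, pair };

typedef struct value *SCM;

struct value {
  enum type type;
  union {
    int integer;
    struct { SCM car, cdr; } pair;
  } value;
};

/* Hypothetical helper: even a small integer needs a heap allocation. */
static SCM make_integer (int n)
{
  SCM x = malloc (sizeof *x);
  if (x == NULL)
    abort ();
  x->type = integer;
  x->value.integer = n;
  return x;
}

/* Hypothetical helper: build a pair from two existing values. */
static SCM cons (SCM car, SCM cdr)
{
  SCM x = malloc (sizeof *x);
  if (x == NULL)
    abort ();
  x->type = pair;
  x->value.pair.car = car;
  x->value.pair.cdr = cdr;
  return x;
}

/* The type test is a memory reference followed by a comparison. */
static int is_pair (SCM x)
{
  return x->type == pair;
}

/* Take the car, checking the run-time type first. */
static SCM checked_car (SCM x)
{
  assert (is_pair (x));
  return x->value.pair.car;
}
```

Calling cons (make_integer (1), make_integer (2)) performs three allocations, one of them for the pair itself and two for its integer elements, which is the cost the following sections set out to reduce.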
212 | ||
213 | ||
214 | @node Faster Integers | |
215 | @subsection Faster Integers | |
216 | ||
217 | Unfortunately, the above representation has a serious disadvantage. In | |
218 | order to return an integer, an expression must allocate a @code{struct | |
219 | value}, initialize it to represent that integer, and return a pointer to | |
220 | it. Furthermore, fetching an integer's value requires a memory | |
221 | reference, which is much slower than a register reference on most | |
222 | processors. Since integers are extremely common, this representation is | |
223 | too costly, in both time and space. Integers should be very cheap to | |
224 | create and manipulate. | |
225 | ||
226 | One possible solution comes from the observation that, on many | |
227 | architectures, structures must be aligned on a four-byte boundary. | |
228 | (Whether or not the machine actually requires it, we can write our own | |
229 | allocator for @code{struct value} objects that assures this is true.) | |
230 | In this case, the lower two bits of the structure's address are known to | |
231 | be zero. | |
232 | ||
233 | This gives us the room we need to provide an improved representation | |
234 | for integers. We make the following rules: | |
235 | @itemize @bullet | |
236 | @item | |
237 | If the lower two bits of an @code{SCM} value are zero, then the @code{SCM} | |
238 | value is a pointer to a @code{struct value}, and everything proceeds as | |
239 | before. | |
240 | @item | |
241 | Otherwise, the @code{SCM} value represents an integer, whose value | |
242 | appears in its upper bits. | |
243 | @end itemize | |
244 | ||
245 | Here is C code implementing this convention: | |
246 | @example | |
247 | enum type @{ pair, string, vector, ... @}; | |
248 | ||
249 | typedef struct value *SCM; | |
250 | ||
251 | struct value @{ | |
252 | enum type type; | |
253 | union @{ | |
254 | struct @{ SCM car, cdr; @} pair; | |
255 | struct @{ int length; char *elts; @} string; | |
256 | struct @{ int length; SCM *elts; @} vector; | |
257 | ... | |
258 | @} value; | |
259 | @}; | |
260 | ||
261 | #define POINTER_P(x) (((int) (x) & 3) == 0) | |
262 | #define INTEGER_P(x) (! POINTER_P (x)) | |
263 | ||
264 | #define GET_INTEGER(x) ((int) (x) >> 2) | |
265 | #define MAKE_INTEGER(x) ((SCM) (((x) << 2) | 1)) | |
266 | @end example | |
267 | ||
268 | Notice that @code{integer} no longer appears as an element of @code{enum | |
269 | type}, and the union has lost its @code{integer} member. Instead, we | |
270 | use the @code{POINTER_P} and @code{INTEGER_P} macros to make a coarse | |
271 | classification of values into integers and non-integers, and do further | |
272 | type testing as before. | |
273 | ||
274 | Here's how we would answer the questions posed above (again, assume | |
275 | @var{x} is an @code{SCM} value): | |
276 | @itemize @bullet | |
277 | @item | |
278 | To test if @var{x} is an integer, we can write @code{INTEGER_P (@var{x})}. | |
279 | @item | |
280 | To find its value, we can write @code{GET_INTEGER (@var{x})}. | |
281 | @item | |
282 | To test if @var{x} is a vector, we can write: | |
283 | @example | |
284 | @code{POINTER_P (@var{x}) && @var{x}->type == vector} | |
285 | @end example | |
286 | Given the new representation, we must make sure @var{x} is truly a | |
287 | pointer before we dereference it to determine its complete type. | |
288 | @item | |
289 | If we know @var{x} is a vector, we can write | |
290 | @code{@var{x}->value.vector.elts[0]} to refer to its first element, as | |
291 | before. | |
292 | @item | |
293 | If we know @var{x} is a pair, we can write | |
294 | @code{@var{x}->value.pair.car} to extract its car, just as before. | |
295 | @end itemize | |
296 | ||
297 | This representation allows us to operate more efficiently on integers | |
298 | than the first. For example, if @var{x} and @var{y} are known to be | |
299 | integers, we can compute their sum as follows: | |
300 | @example | |
301 | MAKE_INTEGER (GET_INTEGER (@var{x}) + GET_INTEGER (@var{y})) | |
302 | @end example | |
303 | Now, integer math requires no allocation or memory references. Most | |
304 | real Scheme systems actually use an even more efficient representation, | |
305 | but this essay isn't about bit-twiddling. (Hint: what if pointers had | |
306 | @code{01} in their least significant bits, and integers had @code{00}?) | |
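The following sketch restates the integer convention in a form that compiles on its own, and then tries out the variant suggested by the hint. It rests on several assumptions that are not in the original code: values are modeled as an intptr_t word rather than a struct pointer, all function names are hypothetical, integers always carry the tag 01, multiplication and division replace the shifts (shifting negative values is not portable C), and overflow is not checked.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: a word wide enough to hold either a pointer or a tagged integer. */
typedef intptr_t word;

/* Convention from the text: low two bits zero means pointer, otherwise integer. */
static int pointer_p (word x) { return ((uintptr_t) x & 3u) == 0; }
static int integer_p (word x) { return !pointer_p (x); }

/* Tag 01: the integer lives in the upper bits.  No overflow check. */
static word make_integer (intptr_t n)
{
  return n * 4 + 1;
}

static intptr_t get_integer (word x)
{
  assert (((uintptr_t) x & 3u) == 1);   /* this sketch only produces tag 01 */
  return (x - 1) / 4;                   /* exact, since x - 1 is a multiple of 4 */
}

/* Addition needs no allocation and no memory reference. */
static word add_integers (word x, word y)
{
  return make_integer (get_integer (x) + get_integer (y));
}

/* Variant from the hint: integers tagged 00 (pointers would then carry 01). */
static word make_integer00 (intptr_t n) { return n * 4; }
static intptr_t get_integer00 (word x)  { return x / 4; }

/* With a zero tag, adding the tagged words adds the integers directly. */
static word add_integers00 (word x, word y)
{
  return x + y;
}
```

With the zero tag, addition and subtraction work on the tagged words without any untagging or retagging, which is one reason that variant is attractive.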
307 | ||
308 | ||
309 | @node Cheaper Pairs | |
310 | @subsection Cheaper Pairs | |
311 | ||
312 | However, there is yet another issue to confront. Most Scheme heaps | |
313 | contain more pairs than any other type of object; Jonathan Rees says | |
314 | that pairs occupy 45% of the heap in his Scheme implementation, Scheme | |
315 | 48. However, our representation above spends three @code{SCM}-sized | |
316 | words per pair --- one for the type, and two for the @sc{car} and | |
317 | @sc{cdr}. Is there any way to represent pairs using only two words? | |
318 | ||
319 | Let us refine the convention we established earlier. Let us assert | |
320 | that: | |
321 | @itemize @bullet | |
322 | @item | |
323 | If the bottom two bits of an @code{SCM} value are @code{#b00}, then | |
324 | it is a pointer, as before. | |
325 | @item | |
326 | If the bottom two bits are @code{#b01}, then the upper bits are an | |
327 | integer. This is a bit more restrictive than before. | |
328 | @item | |
329 | If the bottom two bits are @code{#b10}, then the value, with the bottom | |
330 | two bits masked out, is the address of a pair. | |
331 | @end itemize | |
332 | ||
333 | Here is the new C code: | |
334 | @example | |
335 | enum type @{ string, vector, ... @}; | |
336 | ||
337 | typedef struct value *SCM; | |
338 | ||
339 | struct value @{ | |
340 | enum type type; | |
341 | union @{ | |
342 | struct @{ int length; char *elts; @} string; | |
343 | struct @{ int length; SCM *elts; @} vector; | |
344 | ... | |
345 | @} value; | |
346 | @}; | |
347 | ||
348 | struct pair @{ | |
349 | SCM car, cdr; | |
350 | @}; | |
351 | ||
352 | #define POINTER_P(x) (((int) (x) & 3) == 0) | |
353 | ||
354 | #define INTEGER_P(x) (((int) (x) & 3) == 1) | |
355 | #define GET_INTEGER(x) ((int) (x) >> 2) | |
356 | #define MAKE_INTEGER(x) ((SCM) (((x) << 2) | 1)) | |
357 | ||
358 | #define PAIR_P(x) (((int) (x) & 3) == 2) | |
359 | #define GET_PAIR(x) ((struct pair *) ((int) (x) & ~3)) | |
360 | @end example | |
361 | ||
362 | Notice that @code{enum type} and @code{struct value} now only contain | |
363 | provisions for vectors and strings; both integers and pairs have become | |
364 | special cases. The code above also assumes that an @code{int} is large | |
365 | enough to hold a pointer, which isn't generally true. | |
366 | ||
367 | ||
368 | Our list of examples is now as follows: | |
369 | @itemize @bullet | |
370 | @item | |
371 | To test if @var{x} is an integer, we can write @code{INTEGER_P | |
372 | (@var{x})}; this is as before. | |
373 | @item | |
374 | To find its value, we can write @code{GET_INTEGER (@var{x})}, as | |
375 | before. | |
376 | @item | |
377 | To test if @var{x} is a vector, we can write: | |
378 | @example | |
379 | @code{POINTER_P (@var{x}) && @var{x}->type == vector} | |
380 | @end example | |
381 | We must still make sure that @var{x} is a pointer to a @code{struct | |
382 | value} before dereferencing it to find its type. | |
383 | @item | |
384 | If we know @var{x} is a vector, we can write | |
385 | @code{@var{x}->value.vector.elts[0]} to refer to its first element, as | |
386 | before. | |
387 | @item | |
388 | We can write @code{PAIR_P (@var{x})} to determine if @var{x} is a | |
389 | pair, and then write @code{GET_PAIR (@var{x})->car} to refer to its | |
390 | car. | |
391 | @end itemize | |
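The pair convention can be exercised with a small self-contained sketch. The assumptions here are not from the original code: values are modeled as a uintptr_t word (which avoids assuming that an int can hold a pointer), the function names and EMPTY_LIST marker are hypothetical, only non-negative integers are handled, and malloc is assumed to return memory aligned to at least four bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

typedef uintptr_t SCM;   /* assumption: an unsigned word that can hold a pointer */

struct pair { SCM car, cdr; };

#define TAG_MASK    ((uintptr_t) 3)
#define INTEGER_TAG ((uintptr_t) 1)
#define PAIR_TAG    ((uintptr_t) 2)

/* Hypothetical end-of-list marker for this sketch: a null pointer (tag 00). */
#define EMPTY_LIST  ((SCM) 0)

static int integer_p (SCM x) { return (x & TAG_MASK) == INTEGER_TAG; }
static int pair_p (SCM x)    { return (x & TAG_MASK) == PAIR_TAG; }

/* Non-negative integers only, to keep the sketch free of sign handling. */
static SCM make_integer (uintptr_t n) { return (n << 2) | INTEGER_TAG; }

static uintptr_t get_integer (SCM x)
{
  assert (integer_p (x));
  return x >> 2;
}

/* Mask off the tag bits to recover the address of the two-word pair. */
static struct pair *get_pair (SCM x)
{
  assert (pair_p (x));
  return (struct pair *) (x & ~TAG_MASK);
}

static SCM cons (SCM car, SCM cdr)
{
  struct pair *p = malloc (sizeof *p);
  if (p == NULL)
    abort ();
  /* The scheme relies on the low two bits of the address being zero. */
  assert (((uintptr_t) p & TAG_MASK) == 0);
  p->car = car;
  p->cdr = cdr;
  return (uintptr_t) p | PAIR_TAG;
}

/* Walking a list tests only tag bits until a cdr must be fetched. */
static size_t list_length (SCM list)
{
  size_t n = 0;
  while (pair_p (list))
    {
      n++;
      list = get_pair (list)->cdr;
    }
  return n;
}
```

Each pair now occupies exactly two words, and the pair test in list_length never touches memory.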
392 | ||
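393 | This change in representation reduces our heap size by about 15%: each pair shrinks from three words to two, saving a third of the 45% of the heap that pairs occupy. It also | |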
394 | makes it cheaper to decide if a value is a pair, because no memory | |
395 | references are necessary; it suffices to check the bottom two bits of | |
396 | the @code{SCM} value. This may be significant when traversing lists, a | |
397 | common activity in a Scheme system. | |
398 | ||
85a9b4ed | 399 | Again, most real Scheme systems use a slightly different implementation; |
38a93523 NJ |
400 | for example, if @code{GET_PAIR} subtracts off the low bits of @code{x}, instead |
401 | of masking them off, the optimizer will often be able to combine that | |
402 | subtraction with the addition of the offset of the structure member we | |
403 | are referencing, making a modified pointer as fast to use as an | |
404 | unmodified pointer. | |
405 | ||
406 | ||
407 | @node Guile Is Hairier | |
408 | @subsection Guile Is Hairier | |
409 | ||
410 | We originally started with a very simple typing system --- each object | |
411 | has a field that indicates its type. Then, for the sake of efficiency | |
412 | in both time and space, we moved some of the typing information directly | |
413 | into the @code{SCM} value, and left the rest in the @code{struct value}. | |
414 | Guile itself employs a more complex hierarchy, storing finer and finer | |
415 | gradations of type information in different places, depending on the | |
416 | object's coarser type. | |
417 | ||
418 | In the author's opinion, Guile could be simplified greatly without | |
419 | significant loss of efficiency, but the simplified system would still be | |
420 | more complex than what we've presented above. | |
421 | ||
422 | ||
423 | @node How Guile does it | |
424 | @section How Guile does it | |
425 | ||
426 | Here we present the specifics of how Guile represents its data. We | |
427 | don't go into complete detail; an exhaustive description of Guile's | |
428 | system would be boring, and we do not wish to encourage people to write | |
429 | code which depends on its details anyway. We do, however, present | |
430 | everything one need know to use Guile's data. | |
431 | ||
432 | ||
433 | @menu | |
434 | * General Rules:: | |
435 | * Conservative GC:: | |
abaec75d | 436 | * Immediates vs Non-immediates:: |
38a93523 NJ |
437 | * Immediate Datatypes:: |
438 | * Non-immediate Datatypes:: | |
439 | * Signalling Type Errors:: | |
505392ae | 440 | * Unpacking the SCM type:: |
38a93523 NJ |
441 | @end menu |
442 | ||
443 | @node General Rules | |
444 | @subsection General Rules | |
445 | ||
446 | Any code which operates on Guile datatypes must @code{#include} the | |
447 | header file @code{<libguile.h>}. This file contains a definition for | |
448 | the @code{SCM} typedef (Guile's universal type, as in the examples | |
449 | above), and definitions and declarations for a host of macros and | |
450 | functions that operate on @code{SCM} values. | |
451 | ||
452 | All identifiers declared by @code{<libguile.h>} begin with @code{scm_} | |
453 | or @code{SCM_}. | |
454 | ||
455 | @c [[I wish this were true, but I don't think it is at the moment. -JimB]] | |
456 | @c Macros do not evaluate their arguments more than once, unless documented | |
457 | @c to do so. | |
458 | ||
459 | The functions described here generally check the types of their | |
460 | @code{SCM} arguments, and signal an error if their arguments are of an | |
461 | inappropriate type. Macros generally do not, unless that is their | |
462 | specified purpose. You must verify their argument types beforehand, as | |
463 | necessary. | |
464 | ||
465 | Macros and functions that return a boolean value have names ending in | |
466 | @code{P} or @code{_p} (for ``predicate''). Those that return a negated | |
467 | boolean value have names starting with @code{SCM_N}. For example, | |
468 | @code{SCM_IMP (@var{x})} is a predicate which returns non-zero iff | |
469 | @var{x} is an immediate value (an @code{IM}). @code{SCM_NCONSP | |
470 | (@var{x})} is a predicate which returns non-zero iff @var{x} is | |
471 | @emph{not} a pair object (a @code{CONS}). | |
472 | ||
473 | ||
474 | @node Conservative GC | |
475 | @subsection Conservative Garbage Collection | |
476 | ||
477 | Aside from the latent typing, the major source of constraints on a | |
478 | Scheme implementation's data representation is the garbage collector. | |
479 | The collector must be able to traverse every live object in the heap, to | |
480 | determine which objects are not live. | |
481 | ||
482 | There are many ways to implement this, but Guile uses an algorithm | |
483 | called @dfn{mark and sweep}. The collector scans the system's global | |
484 | variables and the local variables on the stack to determine which | |
485 | objects are immediately accessible by the C code. It then scans those | |
486 | objects to find the objects they point to, @i{et cetera}. The collector | |
487 | sets a @dfn{mark bit} on each object it finds, so each object is | |
488 | traversed only once. This process is called @dfn{tracing}. | |
489 | ||
490 | When the collector can find no unmarked objects pointed to by marked | |
491 | objects, it assumes that any objects that are still unmarked will never | |
492 | be used by the program (since there is no path of dereferences from any | |
493 | global or local variable that reaches them) and deallocates them. | |
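The tracing and deallocation steps just described can be sketched on a toy heap. This is only an illustration of the general mark and sweep idea, not Guile's collector: the fixed-size heap, the object layout with two child pointers, and all function names are assumptions made for the example, and the roots are passed in explicitly.

```c
#include <assert.h>
#include <stddef.h>

#define HEAP_SIZE 8

/* A toy heap object: a mark bit, an in-use flag, and two outgoing pointers. */
struct object {
  int marked;
  int in_use;
  struct object *car, *cdr;
};

static struct object heap[HEAP_SIZE];

/* Hand out the first free slot, or NULL if the toy heap is full. */
static struct object *allocate (struct object *car, struct object *cdr)
{
  for (size_t i = 0; i < HEAP_SIZE; i++)
    if (!heap[i].in_use)
      {
        heap[i].in_use = 1;
        heap[i].marked = 0;
        heap[i].car = car;
        heap[i].cdr = cdr;
        return &heap[i];
      }
  return NULL;
}

/* Tracing: mark everything reachable from obj.  The mark bit ensures each
   object is traversed only once, so cycles terminate. */
static void mark (struct object *obj)
{
  while (obj != NULL && !obj->marked)
    {
      obj->marked = 1;
      mark (obj->car);
      obj = obj->cdr;
    }
}

/* Sweeping: free every unmarked object and clear marks for the next cycle.
   Returns the number of objects freed. */
static size_t sweep (void)
{
  size_t freed = 0;
  for (size_t i = 0; i < HEAP_SIZE; i++)
    {
      if (heap[i].in_use && !heap[i].marked)
        {
          heap[i].in_use = 0;
          freed++;
        }
      heap[i].marked = 0;
    }
  return freed;
}

/* One collection: trace from the given roots, then sweep. */
static size_t collect (struct object **roots, size_t nroots)
{
  for (size_t i = 0; i < nroots; i++)
    mark (roots[i]);
  return sweep ();
}
```

An object that no root can reach is freed by collect, while anything reachable through car or cdr links survives.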
494 | ||
495 | In the above paragraphs, we did not specify how the garbage collector | |
496 | finds the global and local variables; as usual, there are many different | |
497 | approaches. Frequently, the programmer must maintain a list of pointers | |
498 | to all global variables that refer to the heap, and another list | |
499 | (adjusted upon entry to and exit from each function) of local variables, | |
500 | for the collector's benefit. | |
501 | ||
502 | The list of global variables is usually not too difficult to maintain, | |
503 | since global variables are relatively rare. However, an explicitly | |
504 | maintained list of local variables (in the author's personal experience) | |
505 | is a nightmare to maintain. Thus, Guile uses a technique called | |
506 | @dfn{conservative garbage collection}, to make the local variable list | |
507 | unnecessary. | |
508 | ||
509 | The trick to conservative collection is to treat the stack as an | |
510 | ordinary range of memory, and assume that @emph{every} word on the stack | |
511 | is a pointer into the heap. Thus, the collector marks all objects whose | |
512 | addresses appear anywhere in the stack, without knowing for sure how | |
513 | that word is meant to be interpreted. | |
514 | ||
515 | Obviously, such a system will occasionally retain objects that are | |
516 | actually garbage, and should be freed. In practice, this is not a | |
517 | problem. The alternative, an explicitly maintained list of local | |
518 | variable addresses, is effectively much less reliable, due to programmer | |
519 | error. | |
520 | ||
521 | To accommodate this technique, data must be represented so that the | |
522 | collector can accurately determine whether a given stack word is a | |
523 | pointer or not. Guile does this as follows: | |
38a93523 | 524 | |
505392ae | 525 | @itemize @bullet |
38a93523 NJ |
526 | @item |
527 | Every heap object has a two-word header, called a @dfn{cell}. Some | |
528 | objects, like pairs, fit entirely in a cell's two words; others may | |
529 | store pointers to additional memory in either of the words. For | |
530 | example, strings and vectors store their length in the first word, and a | |
531 | pointer to their elements in the second. | |
532 | ||
533 | @item | |
534 | Guile allocates whole arrays of cells at a time, called @dfn{heap | |
535 | segments}. These segments are always allocated so that the cells they | |
536 | contain fall on eight-byte boundaries, or whatever is appropriate for | |
537 | the machine's word size. Guile keeps all cells in a heap segment | |
538 | initialized, whether or not they are currently in use. | |
539 | ||
540 | @item | |
541 | Guile maintains a sorted table of heap segments. | |
38a93523 NJ |
542 | @end itemize |
543 | ||
544 | Thus, given any random word @var{w} fetched from the stack, Guile's | |
545 | garbage collector can consult the table to see if @var{w} falls within a | |
546 | known heap segment, and check @var{w}'s alignment. If both tests pass, | |
547 | the collector knows that @var{w} is a valid pointer to a cell, | |
548 | intentional or not, and proceeds to trace the cell. | |
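A minimal sketch of this test follows, assuming a table of segments sorted by address and an eight-byte cell alignment. The struct layout, the constant, and the function names are hypothetical and are not taken from libguile.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One heap segment, covering the addresses [start, end). */
struct segment { uintptr_t start, end; };

/* Assumed cell alignment; the text mentions eight-byte boundaries. */
#define CELL_ALIGNMENT ((uintptr_t) 8)

/* Binary search over a table sorted by address.  Returns the segment
   containing w, or NULL if w lies outside every known segment. */
static const struct segment *
find_segment (const struct segment *table, size_t count, uintptr_t w)
{
  size_t lo = 0, hi = count;
  while (lo < hi)
    {
      size_t mid = lo + (hi - lo) / 2;
      if (w < table[mid].start)
        hi = mid;
      else if (w >= table[mid].end)
        lo = mid + 1;
      else
        return &table[mid];
    }
  return NULL;
}

/* A stack word is treated as a cell pointer if it falls inside a known
   segment and is aligned like a cell, whatever it was meant to be. */
static int
looks_like_cell (const struct segment *table, size_t count, uintptr_t w)
{
  return find_segment (table, count, w) != NULL
         && w % CELL_ALIGNMENT == 0;
}
```

Keeping the table sorted is what makes the lookup cheap enough to run on every word of the stack.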
549 | ||
550 | Note that heap segments do not contain all the data Guile uses; cells | |
551 | for objects like vectors and strings contain pointers to other memory | |
552 | areas. However, since those pointers are internal, and not shared among | |
553 | many pieces of code, it is enough for the collector to find the cell, | |
554 | and then use the cell's type to find more pointers to trace. | |
555 | ||
556 | ||
abaec75d NJ |
557 | @node Immediates vs Non-immediates |
558 | @subsection Immediates vs Non-immediates | |
38a93523 NJ |
559 | |
560 | Guile classifies Scheme objects into two kinds: those that fit entirely | |
561 | within an @code{SCM}, and those that require heap storage. | |
562 | ||
563 | The former class are called @dfn{immediates}. The class of immediates | |
564 | includes small integers, characters, boolean values, the empty list, the | |
565 | mysterious end-of-file object, and some others. | |
566 | ||
85a9b4ed | 567 | The remaining types are called, not surprisingly, @dfn{non-immediates}. |
38a93523 NJ |
568 | They include pairs, procedures, strings, vectors, and all other data |
569 | types in Guile. | |
570 | ||
571 | @deftypefn Macro int SCM_IMP (SCM @var{x}) | |
572 | Return non-zero iff @var{x} is an immediate object. | |
573 | @end deftypefn | |
574 | ||
575 | @deftypefn Macro int SCM_NIMP (SCM @var{x}) | |
576 | Return non-zero iff @var{x} is a non-immediate object. This is the | |
577 | exact complement of @code{SCM_IMP}, above. | |
38a93523 NJ |
578 | @end deftypefn |
579 | ||
ffda6093 | 580 | Note that for versions of Guile prior to 1.4 it was necessary to use the |
abaec75d NJ |
581 | @code{SCM_NIMP} macro before calling a finer-grained predicate to |
582 | determine @var{x}'s type, such as @code{SCM_CONSP} or | |
ffda6093 NJ |
583 | @code{SCM_VECTORP}. This is no longer required: the definitions of all |
584 | Guile type predicates now include a call to @code{SCM_NIMP} where | |
585 | necessary. | |
abaec75d | 586 | |
38a93523 NJ |
587 | |
588 | @node Immediate Datatypes | |
589 | @subsection Immediate Datatypes | |
590 | ||
591 | The following datatypes are immediate values; that is, they fit entirely | |
592 | within an @code{SCM} value. The @code{SCM_IMP} and @code{SCM_NIMP} | |
593 | macros will distinguish these from non-immediates; see @ref{Immediates | |
abaec75d | 594 | vs Non-immediates} for an explanation of the distinction. |
38a93523 NJ |
595 | |
596 | Note that the type predicates for immediate values work correctly on any | |
597 | @code{SCM} value; you do not need to call @code{SCM_IMP} first, to | |
505392ae | 598 | establish that a value is immediate. |
38a93523 NJ |
599 | |
600 | @menu | |
601 | * Integer Data:: | |
602 | * Character Data:: | |
603 | * Boolean Data:: | |
604 | * Unique Values:: | |
605 | @end menu | |
606 | ||
607 | @node Integer Data | |
608 | @subsubsection Integers | |
609 | ||
610 | Here are functions for operating on small integers that fit within an | |
611 | @code{SCM}. Such integers are called @dfn{immediate numbers}, or | |
612 | @dfn{INUMs}. In general, INUMs occupy all but two bits of an | |
613 | @code{SCM}. | |
614 | ||
615 | Bignums and floating-point numbers are non-immediate objects, and have | |
616 | their own, separate accessors. The functions here will not work on | |
617 | them. This is not as much of a problem as you might think, however, | |
618 | because the system never constructs bignums that could fit in an INUM, | |
619 | and never uses floating point values for exact integers. | |
620 | ||
621 | @deftypefn Macro int SCM_INUMP (SCM @var{x}) | |
622 | Return non-zero iff @var{x} is a small integer value. | |
623 | @end deftypefn | |
624 | ||
625 | @deftypefn Macro int SCM_NINUMP (SCM @var{x}) | |
626 | Return non-zero iff @var{x} is not a small integer value. This is the complement of @code{SCM_INUMP}. | |
627 | @end deftypefn | |
628 | ||
629 | @deftypefn Macro int SCM_INUM (SCM @var{x}) | |
630 | Return the value of @var{x} as an ordinary, C integer. If @var{x} | |
631 | is not an INUM, the result is undefined. | |
632 | @end deftypefn | |
633 | ||
634 | @deftypefn Macro SCM SCM_MAKINUM (int @var{i}) | |
635 | Given a C integer @var{i}, return its representation as an @code{SCM}. | |
636 | This macro does not check for overflow. | |
637 | @end deftypefn | |
638 | ||
639 | ||
640 | @node Character Data | |
641 | @subsubsection Characters | |
642 | ||
643 | Here are functions for operating on characters. | |
644 | ||
645 | @deftypefn Macro int SCM_CHARP (SCM @var{x}) | |
646 | Return non-zero iff @var{x} is a character value. | |
647 | @end deftypefn | |
648 | ||
649 | @deftypefn Macro {unsigned int} SCM_CHAR (SCM @var{x}) | |
650 | Return the value of @var{x} as a C character. If @var{x} is not a | |
651 | Scheme character, the result is undefined. | |
652 | @end deftypefn | |
653 | ||
654 | @deftypefn Macro SCM SCM_MAKE_CHAR (int @var{c}) | |
655 | Given a C character @var{c}, return its representation as a Scheme | |
656 | character value. | |
657 | @end deftypefn | |
658 | ||
659 | ||
660 | @node Boolean Data | |
661 | @subsubsection Booleans | |
662 | ||
663 | Here are functions and macros for operating on booleans. | |
664 | ||
665 | @deftypefn Macro SCM SCM_BOOL_T | |
666 | @deftypefnx Macro SCM SCM_BOOL_F | |
667 | The Scheme true and false values. | |
668 | @end deftypefn | |
669 | ||
670 | @deftypefn Macro int SCM_NFALSEP (@var{x}) | |
671 | Convert the Scheme boolean value to a C boolean. Since every object in | |
672 | Scheme except @code{#f} is true, this amounts to comparing @var{x} to | |
673 | @code{#f}; hence the name. | |
674 | @c Noel feels a chill here. | |
675 | @end deftypefn | |
676 | ||
677 | @deftypefn Macro SCM SCM_BOOL_NOT (@var{x}) | |
678 | Return the boolean inverse of @var{x}. If @var{x} is not a | |
679 | Scheme boolean, the result is undefined. | |
680 | @end deftypefn | |
681 | ||
682 | ||
683 | @node Unique Values | |
684 | @subsubsection Unique Values | |
685 | ||
686 | The immediate values that are neither small integers, characters, nor | |
687 | booleans are all unique values --- that is, datatypes with only one | |
688 | instance. | |
689 | ||
690 | @deftypefn Macro SCM SCM_EOL | |
691 | The Scheme empty list object, or ``End Of List'' object, usually written | |
692 | in Scheme as @code{'()}. | |
693 | @end deftypefn | |
694 | ||
695 | @deftypefn Macro SCM SCM_EOF_VAL | |
696 | The Scheme end-of-file value. It has no standard written | |
697 | representation, for obvious reasons. | |
698 | @end deftypefn | |
699 | ||
700 | @deftypefn Macro SCM SCM_UNSPECIFIED | |
701 | The value returned by expressions which the Scheme standard says return | |
702 | an ``unspecified'' value. | |
703 | ||
704 | This is sort of a weirdly literal way to take things, but the standard | |
705 | read-eval-print loop prints nothing when the expression returns this | |
706 | value, so it's not a bad idea to return this when you can't think of | |
707 | anything else helpful. | |
708 | @end deftypefn | |
709 | ||
710 | @deftypefn Macro SCM SCM_UNDEFINED | |
711 | The ``undefined'' value. Its most important property is that it is not | |
712 | equal to any valid Scheme value. This is put to various internal uses | |
713 | by C code interacting with Guile. | |
714 | ||
715 | For example, when you write a C function that is callable from Scheme | |
716 | and which takes optional arguments, the interpreter passes | |
717 | @code{SCM_UNDEFINED} for any arguments the caller did not supply. | |
718 | ||
719 | We also use this to mark unbound variables. | |
720 | @end deftypefn | |
721 | ||
722 | @deftypefn Macro int SCM_UNBNDP (SCM @var{x}) | |
723 | Return true if @var{x} is @code{SCM_UNDEFINED}. Apply this to a | |
724 | symbol's value to see if it has a binding as a global variable. | |
725 | @end deftypefn | |
726 | ||
727 | ||
728 | @node Non-immediate Datatypes | |
729 | @subsection Non-immediate Datatypes | |
730 | ||
731 | A non-immediate datatype is one which lives in the heap, either because | |
732 | it cannot fit entirely within a @code{SCM} word, or because it denotes a | |
cee2ed4f | 733 | specific storage location (in the nomenclature of the Revised^5 Report |
38a93523 NJ |
734 | on Scheme). |
735 | ||
736 | The @code{SCM_IMP} and @code{SCM_NIMP} macros will distinguish these | |
abaec75d | 737 | from immediates; see @ref{Immediates vs Non-immediates}. |
38a93523 NJ |
738 | |
739 | Given a cell, Guile distinguishes between pairs and other non-immediate | |
740 | types by storing special @dfn{tag} values in a non-pair cell's car, that | |
741 | cannot appear in normal pairs. A cell with a non-tag value in its car | |
742 | is an ordinary pair. The type of a cell with a tag in its car depends | |
743 | on the tag; the non-immediate type predicates test this value. If a tag | |
744 | value appears elsewhere (in a vector, for example), the heap may become | |
745 | corrupted. | |
746 | ||
505392ae NJ |
747 | Note how the type information for a non-immediate object is split |
748 | between the @code{SCM} word and the cell that the @code{SCM} word points | |
749 | to. The @code{SCM} word itself only indicates that the object is | |
750 | non-immediate --- in other words stored in a heap cell. The tag stored | |
751 | in the first word of the heap cell indicates more precisely the type of | |
752 | that object. | |
753 | ||
ffda6093 NJ |
754 | The type predicates for non-immediate values work correctly on any |
755 | @code{SCM} value; you do not need to call @code{SCM_NIMP} first, to | |
756 | establish that a value is non-immediate. | |
38a93523 NJ |
757 | |
758 | @menu | |
38a93523 NJ |
759 | * Pair Data:: |
760 | * Vector Data:: | |
761 | * Procedures:: | |
762 | * Closures:: | |
763 | * Subrs:: | |
764 | * Port Data:: | |
765 | @end menu | |
766 | ||
38a93523 NJ |
767 | |
768 | @node Pair Data | |
769 | @subsubsection Pairs | |
770 | ||
771 | Pairs are the essential building block of list structure in Scheme. A | |
772 | pair object has two fields, called the @dfn{car} and the @dfn{cdr}. | |
773 | ||
774 | It is conventional for a pair's @sc{car} to contain an element of a | |
775 | list, and the @sc{cdr} to point to the next pair in the list, or to | |
776 | contain @code{SCM_EOL}, indicating the end of the list. Thus, a set of | |
777 | pairs chained through their @sc{cdr}s constitutes a singly-linked list. | |
778 | Scheme and libguile define many functions which operate on lists | |
779 | constructed in this fashion, so although lists chained through the | |
780 | @sc{car}s of pairs will work fine too, they may be less convenient to | |
781 | manipulate, and receive less support from the community. | |
782 | ||
783 | Guile implements pairs by mapping the @sc{car} and @sc{cdr} of a pair | |
784 | directly into the two words of the cell. | |
785 | ||
786 | ||
787 | @deftypefn Macro int SCM_CONSP (SCM @var{x}) | |
788 | Return non-zero iff @var{x} is a Scheme pair object. | |
38a93523 NJ |
789 | @end deftypefn |
790 | ||
791 | @deftypefn Macro int SCM_NCONSP (SCM @var{x}) | |
792 | The complement of @code{SCM_CONSP}. | |
793 | @end deftypefn | |
794 | ||
38a93523 NJ |
795 | @deftypefun SCM scm_cons (SCM @var{car}, SCM @var{cdr}) |
796 | Allocate (``CONStruct'') a new pair, with @var{car} and @var{cdr} as its | |
797 | contents. | |
798 | @end deftypefun | |
799 | ||
85a9b4ed | 800 | The macros below perform no type checking. The results are undefined if |
38a93523 NJ |
801 | @var{cell} is an immediate. However, since all non-immediate Guile |
802 | objects are constructed from cells, and these macros simply return the | |
803 | first element of a cell, they actually can be useful on datatypes other | |
804 | than pairs. (Of course, it is not very modular to use them outside of | |
805 | the code which implements that datatype.) | |
806 | ||
807 | @deftypefn Macro SCM SCM_CAR (SCM @var{cell}) | |
808 | Return the @sc{car}, or first field, of @var{cell}. | |
809 | @end deftypefn | |
810 | ||
811 | @deftypefn Macro SCM SCM_CDR (SCM @var{cell}) | |
812 | Return the @sc{cdr}, or second field, of @var{cell}. | |
813 | @end deftypefn | |
814 | ||
815 | @deftypefn Macro void SCM_SETCAR (SCM @var{cell}, SCM @var{x}) | |
816 | Set the @sc{car} of @var{cell} to @var{x}. | |
817 | @end deftypefn | |
818 | ||
819 | @deftypefn Macro void SCM_SETCDR (SCM @var{cell}, SCM @var{x}) | |
820 | Set the @sc{cdr} of @var{cell} to @var{x}. | |
821 | @end deftypefn | |
822 | ||
823 | @deftypefn Macro SCM SCM_CAAR (SCM @var{cell}) | |
824 | @deftypefnx Macro SCM SCM_CADR (SCM @var{cell}) | |
825 | @deftypefnx Macro SCM SCM_CDAR (SCM @var{cell}) @dots{} | |
826 | @deftypefnx Macro SCM SCM_CDDDDR (SCM @var{cell}) | |
827 | Return the @sc{car} of the @sc{car} of @var{cell}, the @sc{car} of the | |
828 | @sc{cdr} of @var{cell}, @i{et cetera}. | |
829 | @end deftypefn | |
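
The convention of chaining pairs through their @sc{cdr}s can be sketched in plain C. This is a hypothetical model, not libguile code: @code{struct toy_cell}, @code{TOY_EOL}, @code{toy_length} and @code{toy_list_ref} are invented names, elements are plain @code{long} values, and a null pointer stands in for @code{SCM_EOL}.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model, not libguile: a cell is two words, and a list is
   a chain of cells linked through the second word (the cdr). */
struct toy_cell {
    long car;                 /* the list element */
    struct toy_cell *cdr;     /* next pair, or TOY_EOL */
};

#define TOY_EOL ((struct toy_cell *) NULL)   /* stand-in for SCM_EOL */

/* Count the pairs reachable by following cdr fields until TOY_EOL. */
static size_t toy_length(const struct toy_cell *lst)
{
    size_t n = 0;
    for (; lst != TOY_EOL; lst = lst->cdr)
        n++;
    return n;
}

/* Return the car of the k-th pair (0-based).  The list must have more
   than k pairs; the assertion catches running off the end. */
static long toy_list_ref(const struct toy_cell *lst, size_t k)
{
    while (k-- > 0) {
        assert(lst != TOY_EOL);
        lst = lst->cdr;
    }
    assert(lst != TOY_EOL);
    return lst->car;
}
```

Like the unchecked accessor macros above, @code{toy_list_ref} trusts its caller about the shape of the data; the assertions only make a violation visible in a debugging build.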
830 | ||
831 | ||
832 | @node Vector Data | |
833 | @subsubsection Vectors, Strings, and Symbols | |
834 | ||
835 | Vectors, strings, and symbols have some properties in common. They all | |
836 | have a length, and they all have an array of elements. In the case of a | |
837 | vector, the elements are @code{SCM} values; in the case of a string or | |
838 | symbol, the elements are characters. | |
839 | ||
840 | All these types store their length (along with some tagging bits) in the | |
841 | @sc{car} of their header cell, and store a pointer to the elements in | |
842 | their @sc{cdr}. Thus, the @code{SCM_CAR} and @code{SCM_CDR} macros | |
843 | are (somewhat) meaningful when applied to these datatypes. | |
844 | ||
845 | @deftypefn Macro int SCM_VECTORP (SCM @var{x}) | |
846 | Return non-zero iff @var{x} is a vector. | |
38a93523 NJ |
847 | @end deftypefn |
848 | ||
849 | @deftypefn Macro int SCM_STRINGP (SCM @var{x}) | |
850 | Return non-zero iff @var{x} is a string. | |
38a93523 NJ |
851 | @end deftypefn |
852 | ||
853 | @deftypefn Macro int SCM_SYMBOLP (SCM @var{x}) | |
854 | Return non-zero iff @var{x} is a symbol. | |
38a93523 NJ |
855 | @end deftypefn |
856 | ||
cee2ed4f MG |
857 | @deftypefn Macro int SCM_VECTOR_LENGTH (SCM @var{x}) |
858 | @deftypefnx Macro int SCM_STRING_LENGTH (SCM @var{x}) | |
859 | @deftypefnx Macro int SCM_SYMBOL_LENGTH (SCM @var{x}) | |
860 | Return the length of the object @var{x}. The result is undefined if | |
861 | @var{x} is not a vector, string, or symbol, respectively. | |
38a93523 NJ |
862 | @end deftypefn |
863 | ||
cee2ed4f | 864 | @deftypefn Macro {SCM *} SCM_VECTOR_BASE (SCM @var{x}) |
38a93523 | 865 | Return a pointer to the array of elements of the vector @var{x}. |
505392ae | 866 | The result is undefined if @var{x} is not a vector. |
38a93523 NJ |
867 | @end deftypefn |
868 | ||
cee2ed4f MG |
869 | @deftypefn Macro {char *} SCM_STRING_CHARS (SCM @var{x}) |
870 | @deftypefnx Macro {char *} SCM_SYMBOL_CHARS (SCM @var{x}) | |
871 | Return a pointer to the characters of @var{x}. The result is undefined | |
872 | if @var{x} is not a symbol or string, respectively. | |
38a93523 NJ |
873 | @end deftypefn |
874 | ||
875 | There are also a few magic values stuffed into memory before a symbol's | |
876 | characters, but you don't want to know about those. What cruft! | |
877 | ||
878 | ||
879 | @node Procedures | |
880 | @subsubsection Procedures | |
881 | ||
882 | Guile provides two kinds of procedures: @dfn{closures}, which are the | |
883 | result of evaluating a @code{lambda} expression, and @dfn{subrs}, which | |
884 | are C functions packaged up as Scheme objects, to make them available to | |
885 | Scheme programmers. | |
886 | ||
887 | (There are actually other sorts of procedures: compiled closures, and | |
888 | continuations; see the source code for details about them.) | |
889 | ||
890 | @deftypefun SCM scm_procedure_p (SCM @var{x}) | |
891 | Return @code{SCM_BOOL_T} iff @var{x} is a Scheme procedure object, of | |
892 | any sort. Otherwise, return @code{SCM_BOOL_F}. | |
893 | @end deftypefun | |
894 | ||
895 | ||
896 | @node Closures | |
897 | @subsubsection Closures | |
898 | ||
899 | [FIXME: this needs to be further subbed, but texinfo has no subsubsub] | |
900 | ||
901 | A closure is a procedure object, generated as the value of a | |
902 | @code{lambda} expression in Scheme. The representation of a closure is | |
903 | straightforward --- it contains a pointer to the code of the lambda | |
904 | expression from which it was created, and a pointer to the environment | |
905 | it closes over. | |
906 | ||
907 | In Guile, each closure also has a property list, allowing the system to | |
908 | store information about the closure. I'm not sure what this is used for | |
909 | at the moment --- the debugger, maybe? | |
910 | ||
911 | @deftypefn Macro int SCM_CLOSUREP (SCM @var{x}) | |
505392ae | 912 | Return non-zero iff @var{x} is a closure. |
38a93523 NJ |
913 | @end deftypefn |
914 | ||
915 | @deftypefn Macro SCM SCM_PROCPROPS (SCM @var{x}) | |
916 | Return the property list of the closure @var{x}. The results are | |
917 | undefined if @var{x} is not a closure. | |
918 | @end deftypefn | |
919 | ||
920 | @deftypefn Macro void SCM_SETPROCPROPS (SCM @var{x}, SCM @var{p}) | |
921 | Set the property list of the closure @var{x} to @var{p}. The results | |
922 | are undefined if @var{x} is not a closure. | |
923 | @end deftypefn | |
924 | ||
925 | @deftypefn Macro SCM SCM_CODE (SCM @var{x}) | |
505392ae | 926 | Return the code of the closure @var{x}. The result is undefined if |
38a93523 NJ |
927 | @var{x} is not a closure. |
928 | ||
929 | This function should probably only be used internally by the | |
930 | interpreter, since the representation of the code is intimately | |
931 | connected with the interpreter's implementation. | |
932 | @end deftypefn | |
933 | ||
934 | @deftypefn Macro SCM SCM_ENV (SCM @var{x}) | |
935 | Return the environment enclosed by @var{x}. | |
505392ae | 936 | The result is undefined if @var{x} is not a closure. |
38a93523 NJ |
937 | |
938 | This function should probably only be used internally by the | |
939 | interpreter, since the representation of the environment is intimately | |
940 | connected with the interpreter's implementation. | |
941 | @end deftypefn | |
942 | ||
943 | ||
944 | @node Subrs | |
945 | @subsubsection Subrs | |
946 | ||
947 | [FIXME: this needs to be further subbed, but texinfo has no subsubsub] | |
948 | ||
949 | A subr is a pointer to a C function, packaged up as a Scheme object to | |
950 | make it callable by Scheme code. In addition to the function pointer, | |
951 | the subr also contains a pointer to the name of the function, and | |
85a9b4ed | 952 | information about the number of arguments accepted by the C function, for |
38a93523 NJ |
953 | the sake of error checking. |
954 | ||
955 | There is no single type predicate macro that recognizes subrs, as | |
956 | distinct from other kinds of procedures. The closest thing is | |
957 | @code{scm_procedure_p}; see @ref{Procedures}. | |
958 | ||
959 | @deftypefn Macro {char *} SCM_SNAME (@var{x}) | |
505392ae | 960 | Return the name of the subr @var{x}. The result is undefined if |
38a93523 NJ |
961 | @var{x} is not a subr. |
962 | @end deftypefn | |
963 | ||
bcf009c3 | 964 | @deftypefun SCM scm_c_define_gsubr (char *@var{name}, int @var{req}, int @var{opt}, int @var{rest}, SCM (*@var{function})()) |
38a93523 NJ |
965 | Create a new subr object named @var{name}, based on the C function |
966 | @var{function}, make it visible to Scheme as the value of a global | |
967 | variable named @var{name}, and return the subr object. | |
968 | ||
969 | The subr object accepts @var{req} required arguments, @var{opt} optional | |
970 | arguments, and a @var{rest} argument iff @var{rest} is non-zero. The C | |
971 | function @var{function} should accept @code{@var{req} + @var{opt}} | |
972 | arguments, or @code{@var{req} + @var{opt} + 1} arguments if @var{rest} | |
973 | is non-zero. | |
974 | ||
975 | When a subr object is applied, it must be applied to at least @var{req} | |
976 | arguments, or else Guile signals an error. @var{function} receives the | |
977 | subr's first @var{req} arguments as its first @var{req} arguments. If | |
978 | there are fewer than @var{opt} arguments remaining, then @var{function} | |
979 | receives the value @code{SCM_UNDEFINED} for any missing optional | |
980 | arguments. If @var{rest} is non-zero, then any arguments after the first | |
981 | @code{@var{req} + @var{opt}} are packaged up as a list and passed as | |
982 | @var{function}'s last argument. | |
983 | ||
984 | Note that subrs can actually only accept a predefined set of | |
985 | combinations of required, optional, and rest arguments. For example, a | |
986 | subr can take one required argument, or one required and one optional | |
987 | argument, but a subr can't take one required and two optional arguments. | |
988 | It's bizarre, but that's the way the interpreter was written. If the | |
bcf009c3 NJ |
989 | arguments to @code{scm_c_define_gsubr} do not fit one of the predefined |
990 | patterns, then @code{scm_c_define_gsubr} will return a compiled closure | |
38a93523 NJ |
991 | object instead of a subr object. |
992 | @end deftypefun | |
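
The way actual arguments are distributed over @var{req}, @var{opt} and @var{rest} can be sketched as follows. This is a hypothetical model rather than libguile code: arguments are plain @code{int}s, @code{TOY_UNDEFINED} stands in for @code{SCM_UNDEFINED}, and @code{toy_spread_args} is an invented helper. Treating surplus arguments as an error when there is no rest parameter is an assumption of the model.

```c
#include <assert.h>

/* Hypothetical model, not libguile: arguments are plain ints and
   TOY_UNDEFINED stands in for SCM_UNDEFINED. */
#define TOY_UNDEFINED (-1)

/* Distribute nargs actual arguments over req required and opt optional
   parameter slots in out[0 .. req+opt-1], padding missing optionals
   with TOY_UNDEFINED.  Returns the number of arguments left over for
   the rest list, or -1 for an arity error. */
static int toy_spread_args(int req, int opt, int rest,
                           const int *args, int nargs, int *out)
{
    int i;

    assert(req >= 0 && opt >= 0 && nargs >= 0);

    if (nargs < req)
        return -1;                       /* too few arguments */
    if (!rest && nargs > req + opt)
        return -1;                       /* too many, and no rest list */

    for (i = 0; i < req + opt; i++)
        out[i] = (i < nargs) ? args[i] : TOY_UNDEFINED;

    return (nargs > req + opt) ? nargs - (req + opt) : 0;
}
```

A C function registered with one required and two optional parameters would therefore always be called with three values, and it has to test each optional one against the undefined marker before using it.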
993 | ||
994 | ||
995 | @node Port Data | |
996 | @subsubsection Ports | |
997 | ||
998 | Haven't written this yet, 'cos I don't understand ports yet. | |
999 | ||
1000 | ||
1001 | @node Signalling Type Errors | |
1002 | @subsection Signalling Type Errors | |
1003 | ||
1004 | Every function visible at the Scheme level should aggressively check the | |
1005 | types of its arguments, to avoid misinterpreting a value, and perhaps | |
1006 | causing a segmentation fault. Guile provides some macros to make this | |
1007 | easier. | |
1008 | ||
813c57db NJ |
1009 | @deftypefn Macro void SCM_ASSERT (int @var{test}, SCM @var{obj}, unsigned int @var{position}, const char *@var{subr}) |
1010 | If @var{test} is zero, signal a ``wrong type argument'' error, | |
1011 | attributed to the subroutine named @var{subr}, operating on the value | |
1012 | @var{obj}, which is the @var{position}'th argument of @var{subr}. | |
38a93523 NJ |
1013 | @end deftypefn |
1014 | ||
1015 | @deftypefn Macro int SCM_ARG1 | |
1016 | @deftypefnx Macro int SCM_ARG2 | |
1017 | @deftypefnx Macro int SCM_ARG3 | |
1018 | @deftypefnx Macro int SCM_ARG4 | |
1019 | @deftypefnx Macro int SCM_ARG5 | |
813c57db NJ |
1020 | @deftypefnx Macro int SCM_ARG6 |
1021 | @deftypefnx Macro int SCM_ARG7 | |
1022 | One of the above values can be used for @var{position} to indicate the | |
1023 | number of the argument of @var{subr} which is being checked. | |
1024 | Alternatively, a positive integer can be used, which makes it possible to | |
1025 | check arguments after the seventh. However, for parameter numbers up to | |
1026 | seven it is preferable to use the matching macro, @code{SCM_ARG1} through | |
1027 | @code{SCM_ARG7}, instead of the raw number, since it makes the code easier to | |
1028 | understand. | |
38a93523 NJ |
1029 | @end deftypefn |
1030 | ||
1031 | @deftypefn Macro int SCM_ARGn | |
813c57db NJ |
1032 | Passing a value of zero or @code{SCM_ARGn} for @var{position} leaves | |
1033 | unspecified which argument's type is incorrect. Again, | |
1034 | @code{SCM_ARGn} should be preferred over a raw zero constant. | |
38a93523 NJ |
1035 | @end deftypefn |
1036 | ||
1037 | ||
505392ae NJ |
1038 | @node Unpacking the SCM type |
1039 | @subsection Unpacking the SCM Type | |
1040 | ||
1041 | The previous sections have explained how @code{SCM} values can refer to | |
1042 | immediate and non-immediate Scheme objects. For immediate objects, the | |
1043 | complete object value is stored in the @code{SCM} word itself, while for | |
1044 | non-immediates, the @code{SCM} word contains a pointer to a heap cell, | |
1045 | and further information about the object in question is stored in that | |
1046 | cell. This section describes how the @code{SCM} type is actually | |
1047 | represented and used at the C level. | |
1048 | ||
1049 | In fact, there are two basic C data types to represent objects in Guile: | |
1050 | ||
bcf009c3 | 1051 | @deftp {Data type} SCM |
505392ae NJ |
1052 | @code{SCM} is the user level abstract C type that is used to represent |
1053 | all of Guile's Scheme objects, no matter what the Scheme object type is. | |
1054 | No C operation except assignment is guaranteed to work with variables of | |
1055 | type @code{SCM}, so you should only use macros and functions to work | |
1056 | with @code{SCM} values. Values are converted between C data types and | |
1057 | the @code{SCM} type with utility functions and macros. | |
bcf009c3 NJ |
1058 | @end deftp |
1059 | @cindex SCM data type | |
505392ae | 1060 | |
bcf009c3 | 1061 | @deftp {Data type} scm_t_bits |
9d5315b6 | 1062 | @code{scm_t_bits} is an integral data type that is guaranteed to be |
505392ae NJ |
1063 | large enough to hold all information that is required to represent any |
1064 | Scheme object. While this data type is mostly used to implement Guile's | |
1065 | internals, the use of this type is also necessary to write certain kinds | |
1066 | of extensions to Guile. | |
bcf009c3 | 1067 | @end deftp |
505392ae NJ |
1068 | |
1069 | @menu | |
9d5315b6 | 1070 | * Relationship between SCM and scm_t_bits:: |
505392ae NJ |
1071 | * Immediate objects:: |
1072 | * Non-immediate objects:: | |
9d5315b6 | 1073 | * Allocating Cells:: |
505392ae NJ |
1074 | * Heap Cell Type Information:: |
1075 | * Accessing Cell Entries:: | |
1076 | * Basic Rules for Accessing Cell Entries:: | |
1077 | @end menu | |
1078 | ||
1079 | ||
9d5315b6 MV |
1080 | @node Relationship between SCM and scm_t_bits |
1081 | @subsubsection Relationship between @code{SCM} and @code{scm_t_bits} | |
505392ae NJ |
1082 | |
1083 | A variable of type @code{SCM} is guaranteed to hold a valid Scheme | |
9d5315b6 | 1084 | object. A variable of type @code{scm_t_bits}, on the other hand, may |
505392ae NJ |
1085 | hold a representation of a @code{SCM} value as a C integral type, but |
1086 | may also hold any C value, even if it does not correspond to a valid | |
1087 | Scheme object. | |
1088 | ||
1089 | For a variable @var{x} of type @code{SCM}, the Scheme object's type | |
1090 | information is stored in a form that is not directly usable. To be able | |
1091 | to work on the type encoding of the Scheme value, the @code{SCM} | |
1092 | variable has to be transformed into the corresponding representation as | |
9d5315b6 | 1093 | a @code{scm_t_bits} variable @var{y} by using the @code{SCM_UNPACK} |
505392ae | 1094 | macro. Once this has been done, the type of the Scheme object @var{x}
9d5315b6 | 1095 | can be derived from the content of the bits of the @code{scm_t_bits} |
505392ae NJ |
1096 | value @var{y}, in the way illustrated by the example earlier in this |
1097 | chapter (@pxref{Cheaper Pairs}). Conversely, a valid bit encoding of a | |
9d5315b6 | 1098 | Scheme value as a @code{scm_t_bits} variable can be transformed into the |
505392ae NJ |
1099 | corresponding @code{SCM} value using the @code{SCM_PACK} macro. |
1100 | ||
9d5315b6 | 1101 | @deftypefn Macro scm_t_bits SCM_UNPACK (SCM @var{x}) |
505392ae NJ |
1102 | Transforms the @code{SCM} value @var{x} into its representation as an |
1103 | integral type. Only after applying @code{SCM_UNPACK} is it possible to | |
1104 | access the bits and contents of the @code{SCM} value. | |
1105 | @end deftypefn | |
1106 | ||
9d5315b6 | 1107 | @deftypefn Macro SCM SCM_PACK (scm_t_bits @var{x}) |
505392ae NJ |
1108 | Takes a valid integral representation of a Scheme object and transforms |
1109 | it into its representation as a @code{SCM} value. | |
1110 | @end deftypefn | |
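
One way to picture the split between an opaque handle type and an integral bits type is the sketch below. It is a hypothetical model, not libguile's definitions: @code{toy_scm}, @code{toy_bits}, @code{TOY_UNPACK}, @code{toy_pack} and @code{toy_eq_p} are invented names, and wrapping the word in a struct is just one assumed way of making the handle opaque to C operators.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model, not libguile: wrapping the word in a struct means
   the C compiler rejects arithmetic or comparison on toy_scm values,
   leaving assignment as the only operation that works directly. */
typedef uintptr_t toy_bits;
typedef struct { toy_bits n; } toy_scm;

/* Expose the bit pattern of a handle as an integral value. */
#define TOY_UNPACK(x) ((x).n)

/* Turn an integral bit pattern back into a handle. */
static toy_scm toy_pack(toy_bits b)
{
    toy_scm x;
    x.n = b;
    return x;
}

/* Two handles denote the same object iff their bit patterns match;
   the comparison has to be done on the unpacked representation. */
static int toy_eq_p(toy_scm a, toy_scm b)
{
    return TOY_UNPACK(a) == TOY_UNPACK(b);
}
```

With such a handle type, writing @code{a == b} directly on two handles does not compile, which forces every inspection of the bits to go through the unpacking step.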
1111 | ||
1112 | ||
1113 | @node Immediate objects | |
1114 | @subsubsection Immediate objects | |
1115 | ||
1116 | A Scheme object may either be an immediate, i.e. carrying all necessary | |
1117 | information by itself, or it may contain a reference to a @dfn{cell} | |
1118 | with additional information on the heap. Although in general it should | |
1119 | be irrelevant for user code whether an object is an immediate or not, | |
1120 | within Guile's own code the distinction is sometimes of importance. | |
1121 | Thus, the following low level macro is provided: | |
1122 | ||
1123 | @deftypefn Macro int SCM_IMP (SCM @var{x}) | |
1124 | A Scheme object is an immediate if it fulfills the @code{SCM_IMP} | |
1125 | predicate, otherwise it holds an encoded reference to a heap cell. The | |
1126 | result of the predicate is delivered as a C style boolean value. User | |
1127 | code and code that extends Guile should normally not be required to use | |
1128 | this macro. | |
1129 | @end deftypefn | |
1130 | ||
1131 | @noindent | |
1132 | Summary: | |
1133 | @itemize @bullet | |
1134 | @item | |
1135 | Given a Scheme object @var{x} of unknown type, check first | |
1136 | with @code{SCM_IMP (@var{x})} if it is an immediate object. | |
1137 | @item | |
1138 | If so, all of the type and value information can be determined from the | |
9d5315b6 | 1139 | @code{scm_t_bits} value that is delivered by @code{SCM_UNPACK |
505392ae NJ |
1140 | (@var{x})}. |
1141 | @end itemize | |
1142 | ||
1143 | ||
1144 | @node Non-immediate objects | |
1145 | @subsubsection Non-immediate objects | |
1146 | ||
85a9b4ed | 1147 | A Scheme object of type @code{SCM} that does not fulfill the |
505392ae NJ |
1148 | @code{SCM_IMP} predicate holds an encoded reference to a heap cell. |
1149 | This reference can be decoded to a C pointer to a heap cell using the | |
1150 | @code{SCM2PTR} macro. The encoding of a pointer to a heap cell into a | |
1151 | @code{SCM} value is done using the @code{PTR2SCM} macro. | |
1152 | ||
1153 | @c (FIXME:: this name should be changed) | |
228a24ef | 1154 | @deftypefn Macro (scm_t_cell *) SCM2PTR (SCM @var{x}) |
505392ae NJ |
1155 | Extract and return the heap cell pointer from a non-immediate @code{SCM} |
1156 | object @var{x}. | |
1157 | @end deftypefn | |
1158 | ||
1159 | @c (FIXME:: this name should be changed) | |
228a24ef | 1160 | @deftypefn Macro SCM PTR2SCM (scm_t_cell * @var{x}) |
505392ae NJ |
1161 | Return a @code{SCM} value that encodes a reference to the heap cell |
1162 | pointer @var{x}. | |
1163 | @end deftypefn | |
1164 | ||
1165 | Note that it is also possible to transform a non-immediate @code{SCM} | |
9d5315b6 | 1166 | value by using @code{SCM_UNPACK} into a @code{scm_t_bits} variable. |
505392ae | 1167 | However, the result of @code{SCM_UNPACK} may not be used as a pointer to |
228a24ef | 1168 | a @code{scm_t_cell}: only @code{SCM2PTR} is guaranteed to transform a |
505392ae NJ |
1169 | @code{SCM} object into a valid pointer to a heap cell. Also, it is not |
1170 | allowed to apply @code{PTR2SCM} to anything that is not a valid pointer | |
1171 | to a heap cell. | |
1172 | ||
1173 | @noindent | |
1174 | Summary: | |
1175 | @itemize @bullet | |
1176 | @item | |
1177 | Only use @code{SCM2PTR} on @code{SCM} values for which @code{SCM_IMP} is | |
1178 | false! | |
1179 | @item | |
228a24ef | 1180 | Don't use @code{(scm_t_cell *) SCM_UNPACK (@var{x})}! Use @code{SCM2PTR |
505392ae NJ |
1181 | (@var{x})} instead! |
1182 | @item | |
1183 | Don't use @code{PTR2SCM} for anything but a cell pointer! | |
1184 | @end itemize | |
1185 | ||
9d5315b6 MV |
1186 | @node Allocating Cells |
1187 | @subsubsection Allocating Cells | |
1188 | ||
1189 | Guile provides both ordinary cells with two slots, and double cells | |
1190 | with four slots. The following two functions are the most primitive | |
1191 | way to allocate such cells. | |
1192 | ||
1193 | If the caller intends to use a cell as a header for some other type, she | |
1194 | must pass an appropriate magic value in @var{word_0}, to mark it as a | |
1195 | member of that type, and pass whatever values the type expects as | |
1196 | @var{word_1} and so on. You should generally not need these functions, | |
1197 | unless you are implementing a new datatype, and thoroughly understand | |
1198 | the code in @code{<libguile/tags.h>}. | |
1199 | ||
1200 | If you just want to allocate pairs, use @code{scm_cons}. | |
1201 | ||
228a24ef | 1202 | @deftypefn Function SCM scm_cell (scm_t_bits word_0, scm_t_bits word_1) |
9d5315b6 MV |
1203 | Allocate a new cell, initialize the two slots with @var{word_0} and |
1204 | @var{word_1}, and return it. | |
1205 | ||
1206 | Note that @var{word_0} and @var{word_1} are of type @code{scm_t_bits}. | |
1207 | If you want to pass a @code{SCM} object, you need to use | |
1208 | @code{SCM_UNPACK}. | |
1209 | @end deftypefn | |
1210 | ||
228a24ef DH |
1211 | @deftypefn Function SCM scm_double_cell (scm_t_bits word_0, scm_t_bits word_1, scm_t_bits word_2, scm_t_bits word_3) |
1212 | Like @code{scm_cell}, but allocates a double cell with four | |
9d5315b6 MV |
1213 | slots. |
1214 | @end deftypefn | |
505392ae NJ |
1215 | |
1216 | @node Heap Cell Type Information | |
1217 | @subsubsection Heap Cell Type Information | |
1218 | ||
1219 | Heap cells contain a number of entries, each of which is either a Scheme | |
9d5315b6 | 1220 | object of type @code{SCM} or a raw C value of type @code{scm_t_bits}. |
505392ae NJ |
1221 | Which of the cell entries contain Scheme objects and which contain raw C |
1222 | values is determined by the first entry of the cell, which holds the | |
1223 | cell type information. | |
1224 | ||
9d5315b6 | 1225 | @deftypefn Macro scm_t_bits SCM_CELL_TYPE (SCM @var{x}) |
505392ae NJ |
1226 | For a non-immediate Scheme object @var{x}, deliver the content of the |
1227 | first entry of the heap cell referenced by @var{x}. This value holds | |
1228 | the information about the cell type. | |
1229 | @end deftypefn | |
1230 | ||
9d5315b6 | 1231 | @deftypefn Macro void SCM_SET_CELL_TYPE (SCM @var{x}, scm_t_bits @var{t}) |
505392ae NJ |
1232 | For a non-immediate Scheme object @var{x}, write the value @var{t} into |
1233 | the first entry of the heap cell referenced by @var{x}. The value | |
1234 | @var{t} must hold a valid cell type. | |
1235 | @end deftypefn | |
1236 | ||
1237 | ||
1238 | @node Accessing Cell Entries | |
1239 | @subsubsection Accessing Cell Entries | |
1240 | ||
1241 | For a non-immediate Scheme object @var{x}, the object type can be | |
1242 | determined by reading the cell type entry using the @code{SCM_CELL_TYPE} | |
1243 | macro. For each different type of cell it is known which cell entries | |
1244 | hold Scheme objects and which cell entries hold raw C data. To access | |
1245 | the different cell entries appropriately, the following macros are | |
1246 | provided. | |
1247 | ||
9d5315b6 | 1248 | @deftypefn Macro scm_t_bits SCM_CELL_WORD (SCM @var{x}, unsigned int @var{n}) |
505392ae NJ |
1249 | Deliver the cell entry @var{n} of the heap cell referenced by the |
1250 | non-immediate Scheme object @var{x} as raw data. It is illegal to | |
1251 | access cell entries that hold Scheme objects by using these macros. For | |
1252 | convenience, the following macros are also provided. | |
230712c9 | 1253 | @itemize @bullet |
505392ae NJ |
1254 | @item |
1255 | SCM_CELL_WORD_0 (@var{x}) @result{} SCM_CELL_WORD (@var{x}, 0) | |
1256 | @item | |
1257 | SCM_CELL_WORD_1 (@var{x}) @result{} SCM_CELL_WORD (@var{x}, 1) | |
1258 | @item | |
1259 | @dots{} | |
1260 | @item | |
1261 | SCM_CELL_WORD_@var{n} (@var{x}) @result{} SCM_CELL_WORD (@var{x}, @var{n}) | |
1262 | @end itemize | |
1263 | @end deftypefn | |
1264 | ||
1265 | @deftypefn Macro SCM SCM_CELL_OBJECT (SCM @var{x}, unsigned int @var{n}) | |
1266 | Deliver the cell entry @var{n} of the heap cell referenced by the | |
1267 | non-immediate Scheme object @var{x} as a Scheme object. It is illegal | |
1268 | to access cell entries that do not hold Scheme objects by using these | |
1269 | macros. For convenience, the following macros are also provided. | |
230712c9 | 1270 | @itemize @bullet |
505392ae NJ |
1271 | @item |
1272 | SCM_CELL_OBJECT_0 (@var{x}) @result{} SCM_CELL_OBJECT (@var{x}, 0) | |
1273 | @item | |
1274 | SCM_CELL_OBJECT_1 (@var{x}) @result{} SCM_CELL_OBJECT (@var{x}, 1) | |
1275 | @item | |
1276 | @dots{} | |
1277 | @item | |
1278 | SCM_CELL_OBJECT_@var{n} (@var{x}) @result{} SCM_CELL_OBJECT (@var{x}, | |
1279 | @var{n}) | |
1280 | @end itemize | |
1281 | @end deftypefn | |
1282 | ||
9d5315b6 | 1283 | @deftypefn Macro void SCM_SET_CELL_WORD (SCM @var{x}, unsigned int @var{n}, scm_t_bits @var{w}) |
505392ae NJ |
1284 | Write the raw C value @var{w} into entry number @var{n} of the heap cell |
1285 | referenced by the non-immediate Scheme value @var{x}. Values that are | |
1286 | written into cells this way may only be read from the cells using the | |
1287 | @code{SCM_CELL_WORD} macros or, in case cell entry 0 is written, using | |
1288 | the @code{SCM_CELL_TYPE} macro. For the special case of cell entry 0, | |
1289 | you must make sure that @var{w} contains cell type information which | |
1290 | does not describe a Scheme object. For convenience, the following | |
1291 | macros are also provided. | |
230712c9 | 1292 | @itemize @bullet |
505392ae NJ |
1293 | @item |
1294 | SCM_SET_CELL_WORD_0 (@var{x}, @var{w}) @result{} SCM_SET_CELL_WORD | |
1295 | (@var{x}, 0, @var{w}) | |
1296 | @item | |
1297 | SCM_SET_CELL_WORD_1 (@var{x}, @var{w}) @result{} SCM_SET_CELL_WORD | |
1298 | (@var{x}, 1, @var{w}) | |
1299 | @item | |
1300 | @dots{} | |
1301 | @item | |
1302 | SCM_SET_CELL_WORD_@var{n} (@var{x}, @var{w}) @result{} SCM_SET_CELL_WORD | |
1303 | (@var{x}, @var{n}, @var{w}) | |
1304 | @end itemize | |
1305 | @end deftypefn | |
1306 | ||
1307 | @deftypefn Macro void SCM_SET_CELL_OBJECT (SCM @var{x}, unsigned int @var{n}, SCM @var{o}) | |
1308 | Write the Scheme object @var{o} into entry number @var{n} of the heap | |
1309 | cell referenced by the non-immediate Scheme value @var{x}. Values that | |
1310 | are written into cells this way may only be read from the cells using | |
1311 | the @code{SCM_CELL_OBJECT} macros or, in case cell entry 0 is written, | |
1312 | using the @code{SCM_CELL_TYPE} macro. For the special case of cell | |
1313 | entry 0 the writing of a Scheme object into this cell is only allowed | |
1314 | if the cell forms a Scheme pair. For convenience, the following macros | |
1315 | are also provided. | |
230712c9 | 1316 | @itemize @bullet |
505392ae NJ |
1317 | @item |
1318 | SCM_SET_CELL_OBJECT_0 (@var{x}, @var{o}) @result{} SCM_SET_CELL_OBJECT | |
1319 | (@var{x}, 0, @var{o}) | |
1320 | @item | |
1321 | SCM_SET_CELL_OBJECT_1 (@var{x}, @var{o}) @result{} SCM_SET_CELL_OBJECT | |
1322 | (@var{x}, 1, @var{o}) | |
1323 | @item | |
1324 | @dots{} | |
1325 | @item | |
1326 | SCM_SET_CELL_OBJECT_@var{n} (@var{x}, @var{o}) @result{} | |
1327 | SCM_SET_CELL_OBJECT (@var{x}, @var{n}, @var{o}) | |
1328 | @end itemize | |
1329 | @end deftypefn | |
1330 | ||
1331 | @noindent | |
1332 | Summary: | |
1333 | @itemize @bullet | |
1334 | @item | |
1335 | For a non-immediate Scheme object @var{x} of unknown type, get the type | |
1336 | information by using @code{SCM_CELL_TYPE (@var{x})}. | |
1337 | @item | |
1338 | As soon as the cell type information is available, only use the | |
1339 | appropriate access methods to read and write data to the different cell | |
1340 | entries. | |
1341 | @end itemize | |
1342 | ||
1343 | ||
1344 | @node Basic Rules for Accessing Cell Entries | |
1345 | @subsubsection Basic Rules for Accessing Cell Entries | |
1346 | ||
1347 | For each cell type it is generally up to the implementation of that type | |
1348 | which of the corresponding cell entries hold Scheme objects and which | |
1349 | hold raw C values. However, there is one basic rule that has to be | |
1350 | followed: Scheme pairs consist of exactly two cell entries, which both | |
1351 | contain Scheme objects. Further, a cell which contains a Scheme object | |
1352 | in its first entry has to be a Scheme pair. In other words, it is not | |
1353 | allowed to store a Scheme object in the first cell entry and a | |
1354 | non-Scheme object in the second cell entry. | |
1355 | ||
1356 | @c Fixme:shouldn't this rather be SCM_PAIRP / SCM_PAIR_P ? | |
1357 | @deftypefn Macro int SCM_CONSP (SCM @var{x}) | |
1358 | Determine whether the Scheme object @var{x} is a Scheme pair, | |
1359 | i.e. whether @var{x} references a heap cell consisting of exactly two | |
1360 | entries, where both entries contain a Scheme object. In this case, both | |
1361 | entries will have to be accessed using the @code{SCM_CELL_OBJECT} | |
c4d0cddd NJ |
1362 | macros. Conversely, if the @code{SCM_CONSP} predicate is not |
1363 | fulfilled, the first entry of the Scheme cell is guaranteed not to be a | |
1364 | Scheme value and thus the first cell entry must be accessed using the | |
505392ae NJ |
1365 | @code{SCM_CELL_WORD_0} macro. |
1366 | @end deftypefn | |
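The access rule above can be illustrated with a small stand-alone model. This is only a sketch under stated assumptions: @code{toy_word}, @code{toy_cell}, @code{TOY_TYPE_TAG} and the low-bit convention are hypothetical names and an invented encoding, not Guile's real cell layout or the definition of @code{SCM_CONSP}. It shows the one idea that matters here: the first entry alone tells you whether the cell is a pair, so it also tells you which family of accessors is legal.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model only: NOT Guile's real cell layout or tag encoding. */
typedef uintptr_t toy_word;

struct toy_cell
{
  toy_word word[2];
};

/* Assumed convention for this sketch: a type tag always has its low
   bit set; a word holding an object always has its low bit clear. */
#define TOY_TYPE_TAG(n) ((((toy_word) (n)) << 1) | (toy_word) 1)

/* Nonzero if the first entry holds an object, i.e. the cell is a pair. */
static int
toy_consp (const struct toy_cell *c)
{
  assert (c != NULL);
  return (c->word[0] & 1) == 0;
}

/* Store two objects.  Both entries of a pair must be objects, so a
   word that looks like a tag or raw odd value is refused (returns 0). */
static int
toy_set_pair (struct toy_cell *c, toy_word car, toy_word cdr)
{
  assert (c != NULL);
  if ((car & 1) != 0 || (cdr & 1) != 0)
    return 0;
  c->word[0] = car;
  c->word[1] = cdr;
  return 1;
}

/* Store a type tag plus one raw C value.  A "tag" that looks like an
   object is refused (returns 0), because it would make the cell
   indistinguishable from a pair. */
static int
toy_set_typed (struct toy_cell *c, toy_word tag, toy_word raw)
{
  assert (c != NULL);
  if ((tag & 1) == 0)
    return 0;
  c->word[0] = tag;
  c->word[1] = raw;
  return 1;
}
```

In this model a caller first asks @code{toy_consp}; only when it answers true may both entries be treated as objects, which mirrors the division between the @code{SCM_CELL_OBJECT} and @code{SCM_CELL_WORD} macros described above.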
1367 | ||
1368 | ||
38a93523 NJ |
1369 | @node Defining New Types (Smobs) |
1370 | @section Defining New Types (Smobs) | |
1371 | ||
1372 | @dfn{Smobs} are Guile's mechanism for adding new non-immediate types to | |
1373 | the system.@footnote{The term ``smob'' was coined by Aubrey Jaffer, who | |
1374 | says it comes from ``small object'', referring to the fact that only the | |
1375 | @sc{cdr} and part of the @sc{car} of a smob's cell are available for | |
1376 | use.} To define a new smob type, the programmer provides Guile with | |
1377 | some essential information about the type --- how to print it, how to | |
1378 | garbage collect it, and so on --- and Guile returns a fresh type tag for | |
cee2ed4f MG |
1379 | use in the first word of new cells. The programmer can then use |
1380 | @code{scm_c_define_gsubr} to make a set of C functions that create and | |
38a93523 NJ |
1381 | operate on these objects visible to Scheme code. |
1382 | ||
1383 | (You can find a complete version of the example code used in this | |
1384 | section in the Guile distribution, in @file{doc/example-smob}. That | |
1385 | directory includes a makefile and a suitable @code{main} function, so | |
1386 | you can build a complete interactive Guile shell, extended with the | |
1387 | datatypes described here.) | |
1388 | ||
1389 | @menu | |
1390 | * Describing a New Type:: | |
1391 | * Creating Instances:: | |
85a9b4ed | 1392 | * Type checking:: |
38a93523 NJ |
1393 | * Garbage Collecting Smobs:: |
1394 | * A Common Mistake In Allocating Smobs:: | |
1395 | * Garbage Collecting Simple Smobs:: | |
1396 | * A Complete Example:: | |
1397 | @end menu | |
1398 | ||
1399 | @node Describing a New Type | |
1400 | @subsection Describing a New Type | |
1401 | ||
1402 | To define a new type, the programmer can supply up to four functions to | |
1403 | manage instances of the type; Guile uses a default for any that are omitted: | |
1404 | ||
1405 | @table @code | |
1406 | @item mark | |
1407 | Guile will apply this function to each instance of the new type it | |
1408 | encounters during garbage collection. This function is responsible for | |
1409 | telling the collector about any other non-immediate objects the object | |
1410 | refers to. The default smob mark function is to not mark any data. | |
1411 | @xref{Garbage Collecting Smobs}, for more details. | |
1412 | ||
1413 | @item free | |
e8f1ff71 NJ |
1414 | Guile will apply this function to each instance of the new type it could |
1415 | not find any live pointers to. The function should release all | |
38a93523 | 1416 | resources held by the object and return zero. |
eabd8acf MV |
1417 | This is analogous to the Java finalization method: it is invoked at |
1418 | an unspecified time (when garbage collection occurs) after the object | |
1419 | is dead. The default free function frees the smob data (if the size | |
1420 | of the struct passed to @code{scm_make_smob_type} is non-zero) using | |
1421 | @code{scm_gc_free}. @xref{Garbage Collecting Smobs}, for more | |
1422 | details. | |
38a93523 NJ |
1423 | |
1424 | @item print | |
1425 | @c GJB:FIXME:: @var{exp} and @var{port} need to refer to a prototype of | |
1426 | @c the print function.... where is that, or where should it go? | |
1427 | Guile will apply this function to each instance of the new type to print | |
1428 | the value, as for @code{display} or @code{write}. The function should | |
1429 | write a printed representation of @var{exp} on @var{port}, in accordance | |
1430 | with the parameters in @var{pstate}. (For more information on print | |
cee2ed4f MG |
1431 | states, see @ref{Port Data}.) The default print function prints |
1432 | @code{#<NAME ADDRESS>} where @code{NAME} is the first argument passed to | |
1433 | @code{scm_make_smob_type}. | |
38a93523 NJ |
1434 | |
1435 | @item equalp | |
1436 | If Scheme code asks the @code{equal?} function to compare two instances | |
1437 | of the same smob type, Guile calls this function. It should return | |
1438 | @code{SCM_BOOL_T} if @var{a} and @var{b} should be considered | |
1439 | @code{equal?}, or @code{SCM_BOOL_F} otherwise. If @code{equalp} is | |
1440 | @code{NULL}, @code{equal?} will assume that two instances of this type are | |
1441 | never @code{equal?} unless they are @code{eq?}. | |
1442 | ||
1443 | @end table | |
1444 | ||
1445 | To actually register the new smob type, call @code{scm_make_smob_type}: | |
1446 | ||
9d5315b6 | 1447 | @deftypefun scm_t_bits scm_make_smob_type (const char *name, size_t size) |
38a93523 NJ |
1448 | This function implements the standard way of adding a new smob type, |
1449 | named @var{name}, with instance size @var{size}, to the system. The | |
1450 | return value is a tag that is used in creating instances of the type. | |
1451 | If @var{size} is 0, then no memory will be allocated when instances of | |
1452 | the smob are created, and nothing will be freed by the default free | |
1453 | function. Default values are provided for mark, free, print, and | |
1454 | equalp, as described above. If you want to customize any of these | |
1455 | functions, the call to @code{scm_make_smob_type} should be immediately | |
1456 | followed by calls to one or several of @code{scm_set_smob_mark}, | |
1457 | @code{scm_set_smob_free}, @code{scm_set_smob_print}, and/or | |
1458 | @code{scm_set_smob_equalp}. | |
1459 | @end deftypefun | |
1460 | ||
1461 | Each of the @code{scm_set_smob_XXX} functions below registers a smob | |
1462 | special function for a given type. Each function is intended to be called | |
1463 | at most once per type, and the call should be placed | |
1464 | immediately following the call to @code{scm_make_smob_type}. | |
1465 | ||
9d5315b6 | 1466 | @deftypefun void scm_set_smob_mark (scm_t_bits tc, SCM (*mark) (SCM)) |
38a93523 NJ |
1467 | This function sets the smob marking procedure for the smob type specified by |
1468 | the tag @var{tc}. @var{tc} is the tag returned by @code{scm_make_smob_type}. | |
1469 | @end deftypefun | |
1470 | ||
9d5315b6 | 1471 | @deftypefun void scm_set_smob_free (scm_t_bits tc, size_t (*free) (SCM)) |
38a93523 NJ |
1472 | This function sets the smob freeing procedure for the smob type specified by |
1473 | the tag @var{tc}. @var{tc} is the tag returned by @code{scm_make_smob_type}. | |
1474 | @end deftypefun | |
1475 | ||
9d5315b6 | 1476 | @deftypefun void scm_set_smob_print (scm_t_bits tc, int (*print) (SCM, SCM, scm_print_state*)) |
38a93523 NJ |
1477 | This function sets the smob printing procedure for the smob type specified by |
1478 | the tag @var{tc}. @var{tc} is the tag returned by @code{scm_make_smob_type}. | |
1479 | @end deftypefun | |
1480 | ||
9d5315b6 | 1481 | @deftypefun void scm_set_smob_equalp (scm_t_bits tc, SCM (*equalp) (SCM, SCM)) |
38a93523 NJ |
1482 | This function sets the smob equality-testing predicate for the smob type specified by |
1483 | the tag @var{tc}. @var{tc} is the tag returned by @code{scm_make_smob_type}. | |
1484 | @end deftypefun | |
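The register-then-customize pattern described above can be sketched without Guile at all. The following is a minimal model, assuming a fixed-size table of types; every name in it (@code{toy_make_type}, @code{toy_set_equalp}, @code{toy_equal}, @code{int_equalp}) is hypothetical and none of it is Guile's implementation. It shows a registration call that installs a default and returns a tag, followed by an optional call that overrides the default for that tag.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_TYPES 16

typedef int (*toy_equalp_fn) (const void *a, const void *b);

struct toy_type
{
  const char *name;
  size_t size;
  toy_equalp_fn equalp;
};

static struct toy_type toy_types[TOY_MAX_TYPES];
static int toy_n_types = 0;

/* Default: two instances are equal only if they are the same object. */
static int
toy_default_equalp (const void *a, const void *b)
{
  return a == b;
}

/* Register a type with default behavior.  Returns its tag, or -1 if
   the table is full. */
static int
toy_make_type (const char *name, size_t size)
{
  if (toy_n_types >= TOY_MAX_TYPES)
    return -1;
  toy_types[toy_n_types].name = name;
  toy_types[toy_n_types].size = size;
  toy_types[toy_n_types].equalp = toy_default_equalp;
  return toy_n_types++;
}

/* Override the equality function for TAG; NULL restores the default. */
static void
toy_set_equalp (int tag, toy_equalp_fn fn)
{
  assert (tag >= 0 && tag < toy_n_types);
  toy_types[tag].equalp = fn ? fn : toy_default_equalp;
}

/* Compare two instances of the type identified by TAG. */
static int
toy_equal (int tag, const void *a, const void *b)
{
  assert (tag >= 0 && tag < toy_n_types);
  return toy_types[tag].equalp (a, b);
}

/* Example custom predicate: compare two boxed ints by value. */
static int
int_equalp (const void *a, const void *b)
{
  return *(const int *) a == *(const int *) b;
}
```

Until @code{toy_set_equalp} is called, two distinct boxes holding the same number compare unequal, which parallels the rule that instances without an @code{equalp} function are @code{equal?} only when they are @code{eq?}.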
1485 | ||
cee2ed4f MG |
1486 | In versions 1.4 and earlier, there was another way of creating smob |
1487 | types, using @code{scm_make_smob_type_mfpe}. This function is now | |
1488 | deprecated and will be removed in a future version of Guile. You should | |
1489 | use the mechanism described above for new code, and change old code not | |
1490 | to use deprecated features. | |
1491 | ||
38a93523 NJ |
1492 | Instead of using @code{scm_make_smob_type} and calling each of the |
1493 | individual @code{scm_set_smob_XXX} functions to register each special | |
cee2ed4f | 1494 | function independently, older code can use the deprecated @code{scm_make_smob_type_mfpe} to |
38a93523 | 1495 | register all of the special functions at once as it creates the smob |
cee2ed4f | 1496 | type: |
38a93523 | 1497 | |
cee2ed4f | 1498 | @deftypefun long scm_make_smob_type_mfpe(const char *name, size_t size, SCM (*mark) (SCM), size_t (*free) (SCM), int (*print) (SCM, SCM, scm_print_state*), SCM (*equalp) (SCM, SCM)) |
38a93523 NJ |
1499 | This function invokes @code{scm_make_smob_type} on its first two arguments |
1500 | to add a new smob type named @var{name}, with instance size @var{size}, to the system. | |
1501 | It also registers the @var{mark}, @var{free}, @var{print}, @var{equalp} smob | |
1502 | special functions for that new type. Any of these parameters can be @code{NULL} | |
85a9b4ed | 1503 | to have that special function use Guile's default behavior. |
38a93523 NJ |
1504 | The return value is a tag that is used in creating instances of the type. If @var{size} |
1505 | is 0, then no memory will be allocated when instances of the smob are created, and | |
1506 | nothing will be freed by the default free function. | |
1507 | @end deftypefun | |
1508 | ||
1509 | For example, here is how one might declare and register a new type | |
85a9b4ed | 1510 | representing eight-bit gray-scale images: |
cee2ed4f | 1511 | |
38a93523 NJ |
1512 | @example |
1513 | #include <libguile.h> | |
1514 | ||
9d5315b6 | 1515 | static scm_t_bits image_tag; |
38a93523 NJ |
1516 | |
1517 | void | |
cee2ed4f | 1518 | init_image_type (void) |
38a93523 | 1519 | @{ |
bd5e6840 NJ |
1520 | image_tag = scm_make_smob_type ("image", sizeof (struct image)); |
1521 | scm_set_smob_mark (image_tag, mark_image); | |
1522 | scm_set_smob_free (image_tag, free_image); | |
1523 | scm_set_smob_print (image_tag, print_image); | |
38a93523 NJ |
1524 | @} |
1525 | @end example | |
1526 | ||
1527 | ||
1528 | @node Creating Instances | |
1529 | @subsection Creating Instances | |
1530 | ||
cee2ed4f MG |
1531 | Like other non-immediate types, smobs start with a cell whose first word |
1532 | contains typing information, and whose remaining words are free for any | |
1533 | use. | |
1534 | ||
1535 | After the header word containing the type code, smobs can have either | |
1536 | one, two or three additional words of data. These words store either a | |
1537 | pointer to the internal C structure holding the smob-specific data, or | |
1538 | the smob data itself. To create an instance of a smob type following | |
1539 | these standards, you should use @code{SCM_NEWSMOB}, @code{SCM_NEWSMOB2} | |
1540 | or @code{SCM_NEWSMOB3}:@footnote{The @code{SCM_NEWSMOB2} and | |
1541 | @code{SCM_NEWSMOB3} variants will allocate double cells and thus use | |
1542 | twice as much memory as smobs created by @code{SCM_NEWSMOB}.} | |
1543 | ||
9d5315b6 MV |
1544 | @deftypefn Macro void SCM_NEWSMOB(SCM value, scm_t_bits tag, void *data) |
1545 | @deftypefnx Macro void SCM_NEWSMOB2(SCM value, scm_t_bits tag, void *data1, void *data2) | |
1546 | @deftypefnx Macro void SCM_NEWSMOB3(SCM value, scm_t_bits tag, void *data1, void *data2, void *data3) | |
38a93523 | 1547 | Make @var{value} contain a smob instance of the type with tag @var{tag} |
cee2ed4f MG |
1548 | and smob data @var{data} (or @var{data1}, @var{data2}, and @var{data3}). |
1549 | @var{value} must be previously declared as C type @code{SCM}. | |
38a93523 NJ |
1550 | @end deftypefn |
1551 | ||
1552 | Since it is often the case (e.g., in smob constructors) that you will | |
1553 | create a smob instance and return it, there is also a slightly specialized | |
1554 | macro for this situation: | |
1555 | ||
9d5315b6 MV |
1556 | @deftypefn Macro fn_returns SCM_RETURN_NEWSMOB(scm_t_bits tag, void *data) |
1557 | @deftypefnx Macro fn_returns SCM_RETURN_NEWSMOB2(scm_t_bits tag, void *data1, void *data2) | |
1558 | @deftypefnx Macro fn_returns SCM_RETURN_NEWSMOB3(scm_t_bits tag, void *data1, void *data2, void *data3) | |
38a93523 | 1559 | This macro expands to a block of code that creates a smob instance of |
cee2ed4f | 1560 | the type with tag @var{tag} and smob data @var{data} (or @var{data1}, |
eabd8acf MV |
1561 | @var{data2}, and @var{data3}), and causes the surrounding function to |
1562 | return that @code{SCM} value. It should be the last piece of code in | |
1563 | a block. | |
38a93523 NJ |
1564 | @end deftypefn |
1565 | ||
eabd8acf MV |
1566 | Guile provides some functions for managing memory, which are often |
1567 | helpful when implementing smobs. @xref{Memory Blocks}. | |
38a93523 NJ |
1568 | |
1569 | ||
1570 | Continuing the above example, if the global variable @code{image_tag} | |
bd5e6840 NJ |
1571 | contains a tag returned by @code{scm_make_smob_type}, here is how we |
1572 | could construct a smob whose @sc{cdr} contains a pointer to a freshly | |
38a93523 NJ |
1573 | allocated @code{struct image}: |
1574 | ||
1575 | @example | |
1576 | struct image @{ | |
1577 | int width, height; | |
1578 | char *pixels; | |
1579 | ||
1580 | /* The name of this image */ | |
1581 | SCM name; | |
1582 | ||
1583 | /* A function to call when this image is | |
1584 | modified, e.g., to update the screen, | |
1585 | or SCM_BOOL_F if no action necessary */ | |
1586 | SCM update_func; | |
1587 | @}; | |
1588 | ||
1589 | SCM | |
1590 | make_image (SCM name, SCM s_width, SCM s_height) | |
1591 | @{ | |
1592 | struct image *image; | |
1593 | int width, height; | |
1594 | ||
bd5e6840 | 1595 | SCM_ASSERT (SCM_STRINGP (name), name, SCM_ARG1, "make-image"); |
38a93523 NJ |
1596 | SCM_ASSERT (SCM_INUMP (s_width), s_width, SCM_ARG2, "make-image"); |
1597 | SCM_ASSERT (SCM_INUMP (s_height), s_height, SCM_ARG3, "make-image"); | |
1598 | ||
1599 | width = SCM_INUM (s_width); | |
1600 | height = SCM_INUM (s_height); | |
1601 | ||
eabd8acf | 1602 | image = (struct image *) scm_gc_malloc (sizeof (struct image), "image"); |
38a93523 NJ |
1603 | image->width = width; |
1604 | image->height = height; | |
eabd8acf | 1605 | image->pixels = scm_gc_malloc (width * height, "image pixels"); |
38a93523 NJ |
1606 | image->name = name; |
1607 | image->update_func = SCM_BOOL_F; | |
1608 | ||
1609 | SCM_RETURN_NEWSMOB (image_tag, image); | |
1610 | @} | |
1611 | @end example | |
1612 | ||
413d32b6 | 1613 | |
85a9b4ed TTN |
1614 | @node Type checking |
1615 | @subsection Type checking | |
38a93523 NJ |
1616 | |
1617 | Functions that operate on smobs should aggressively check the types of | |
1618 | their arguments, to avoid misinterpreting some other datatype as a smob, | |
1619 | and perhaps causing a segmentation fault. Fortunately, this is pretty | |
1620 | simple to do. The function need only verify that its argument is a | |
cee2ed4f | 1621 | non-immediate, whose first word is the type tag returned by |
bd5e6840 | 1622 | @code{scm_make_smob_type}. |
38a93523 NJ |
1623 | |
1624 | For example, here is a simple function that operates on an image smob, | |
1625 | and checks the type of its argument. We also present an expanded | |
1626 | version of the @code{init_image_type} function, to make | |
1627 | @code{clear_image} and the image constructor function @code{make_image} | |
1628 | visible to Scheme code. | |
cee2ed4f | 1629 | |
38a93523 NJ |
1630 | @example |
1631 | SCM | |
1632 | clear_image (SCM image_smob) | |
1633 | @{ | |
1634 | int area; | |
1635 | struct image *image; | |
1636 | ||
1637 | SCM_ASSERT (SCM_SMOB_PREDICATE (image_tag, image_smob), | |
1638 | image_smob, SCM_ARG1, "clear-image"); | |
1639 | ||
1640 | image = (struct image *) SCM_SMOB_DATA (image_smob); | |
1641 | area = image->width * image->height; | |
1642 | memset (image->pixels, 0, area); | |
1643 | ||
1644 | /* Invoke the image's update function. */ | |
1645 | if (image->update_func != SCM_BOOL_F) | |
1646 | scm_apply (image->update_func, SCM_EOL, SCM_EOL); | |
1647 | ||
1648 | return SCM_UNSPECIFIED; | |
1649 | @} | |
1650 | ||
1651 | ||
1652 | void | |
cee2ed4f | 1653 | init_image_type (void) |
38a93523 | 1654 | @{ |
bd5e6840 NJ |
1655 | image_tag = scm_make_smob_type ("image", sizeof (struct image)); |
1656 | scm_set_smob_mark (image_tag, mark_image); | |
1657 | scm_set_smob_free (image_tag, free_image); | |
1658 | scm_set_smob_print (image_tag, print_image); | |
38a93523 | 1659 | |
cee2ed4f MG |
1660 | scm_c_define_gsubr ("clear-image", 1, 0, 0, clear_image); |
1661 | scm_c_define_gsubr ("make-image", 3, 0, 0, make_image); | |
38a93523 NJ |
1662 | @} |
1663 | @end example | |
1664 | ||
38a93523 NJ |
1665 | @c GJB:FIXME:: should talk about guile-snarf somewhere! |
1666 | ||
cee2ed4f | 1667 | |
38a93523 NJ |
1668 | @node Garbage Collecting Smobs |
1669 | @subsection Garbage Collecting Smobs | |
1670 | ||
1671 | Once a smob has been released to the tender mercies of the Scheme | |
1672 | system, it must be prepared to survive garbage collection. Guile calls | |
1673 | the @code{mark} and @code{free} functions of the @code{scm_smobfuns} | |
1674 | structure to manage this. | |
1675 | ||
1676 | As described before (@pxref{Conservative GC}), every object in the | |
1677 | Scheme system has a @dfn{mark bit}, which the garbage collector uses to | |
1678 | tell live objects from dead ones. When collection starts, every | |
1679 | object's mark bit is clear. The collector traces pointers through the | |
1680 | heap, starting from objects known to be live, and sets the mark bit on | |
1681 | each object it encounters. When it can find no more unmarked objects, | |
1682 | the collector walks all objects, live and dead, frees those whose mark | |
1683 | bits are still clear, and clears the mark bit on the others. | |
1684 | ||
1685 | The two main portions of the collection are called the @dfn{mark phase}, | |
1686 | during which the collector marks live objects, and the @dfn{sweep | |
1687 | phase}, during which the collector frees all unmarked objects. | |
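The two phases can be made concrete with a toy collector. This is a sketch under simplifying assumptions, not Guile's collector: the heap is a plain array, each object has at most two outgoing references, "freeing" just sets a flag, and all names (@code{toy_obj}, @code{toy_mark}, @code{toy_sweep}, @code{toy_collect}) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Toy object: NOT Guile's cell layout. */
struct toy_obj
{
  int marked;               /* the mark bit */
  int freed;                /* set once the sweep has reclaimed it */
  struct toy_obj *ref[2];   /* outgoing references, or NULL */
};

/* Mark phase: set the mark bit BEFORE recursing, so that circular
   structures terminate instead of looping forever. */
static void
toy_mark (struct toy_obj *o)
{
  if (o == NULL || o->marked)
    return;
  o->marked = 1;
  toy_mark (o->ref[0]);
  toy_mark (o->ref[1]);
}

/* Sweep phase: reclaim every unmarked object and clear the mark bit
   on the survivors.  Returns the number of objects reclaimed. */
static size_t
toy_sweep (struct toy_obj *heap, size_t n)
{
  size_t reclaimed = 0;
  assert (heap != NULL || n == 0);
  for (size_t i = 0; i < n; i++)
    {
      if (heap[i].freed)
        continue;
      if (heap[i].marked)
        heap[i].marked = 0;
      else
        {
          heap[i].freed = 1;
          reclaimed++;
        }
    }
  return reclaimed;
}

/* One full collection: mark from the roots, then sweep the heap. */
static size_t
toy_collect (struct toy_obj *heap, size_t n,
             struct toy_obj **roots, size_t n_roots)
{
  for (size_t i = 0; i < n_roots; i++)
    toy_mark (roots[i]);
  return toy_sweep (heap, n);
}
```

An object that is referenced only from an unreachable object is never visited by @code{toy_mark}, so the sweep reclaims it; this is the same reason a smob must report every Scheme object it refers to.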
1688 | ||
cee2ed4f MG |
1689 | The mark bit of a smob lives in a special memory region. When the |
1690 | collector encounters a smob, it sets the smob's mark bit, and uses the | |
1691 | smob's type tag to find the appropriate @code{mark} function for that | |
1692 | smob: the one listed in that smob's @code{scm_smobfuns} structure. It | |
1693 | then calls the @code{mark} function, passing it the smob as its only | |
1694 | argument. | |
38a93523 NJ |
1695 | |
1696 | The @code{mark} function is responsible for marking any other Scheme | |
1697 | objects the smob refers to. If it does not do so, the objects' mark | |
1698 | bits will still be clear when the collector begins to sweep, and the | |
1699 | collector will free them. If this occurs, it will probably break, or at | |
1700 | least confuse, any code operating on the smob; the smob's @code{SCM} | |
1701 | values will have become dangling references. | |
1702 | ||
1703 | To mark an arbitrary Scheme object, the @code{mark} function may call | |
1704 | this function: | |
1705 | ||
1706 | @deftypefun void scm_gc_mark (SCM @var{x}) | |
1707 | Mark the object @var{x}, and recurse on any objects @var{x} refers to. | |
1708 | If @var{x}'s mark bit is already set, return immediately. | |
1709 | @end deftypefun | |
1710 | ||
1711 | Thus, here is how we might write the @code{mark} function for the image | |
1712 | smob type discussed above: | |
cee2ed4f | 1713 | |
38a93523 NJ |
1714 | @example |
1715 | @group | |
1716 | SCM | |
1717 | mark_image (SCM image_smob) | |
1718 | @{ | |
1719 | /* Mark the image's name and update function. */ | |
1720 | struct image *image = (struct image *) SCM_SMOB_DATA (image_smob); | |
1721 | ||
1722 | scm_gc_mark (image->name); | |
1723 | scm_gc_mark (image->update_func); | |
1724 | ||
1725 | return SCM_BOOL_F; | |
1726 | @} | |
1727 | @end group | |
1728 | @end example | |
1729 | ||
1730 | Note that, even though the image's @code{update_func} could be an | |
1731 | arbitrarily complex structure (representing a procedure and any values | |
1732 | enclosed in its environment), @code{scm_gc_mark} will recurse as | |
1733 | necessary to mark all its components. Because @code{scm_gc_mark} sets | |
1734 | an object's mark bit before it recurses, it is not confused by | |
1735 | circular structures. | |
1736 | ||
1737 | As an optimization, the collector will mark whatever value is returned | |
1738 | by the @code{mark} function; this helps limit depth of recursion during | |
1739 | the mark phase. Thus, the code above could also be written as: | |
1740 | @example | |
1741 | @group | |
1742 | SCM | |
1743 | mark_image (SCM image_smob) | |
1744 | @{ | |
1745 | /* Mark the image's name and update function. */ | |
1746 | struct image *image = (struct image *) SCM_SMOB_DATA (image_smob); | |
1747 | ||
1748 | scm_gc_mark (image->name); | |
1749 | return image->update_func; | |
1750 | @} | |
1751 | @end group | |
1752 | @end example | |
1753 | ||
1754 | ||
1755 | Finally, when the collector encounters an unmarked smob during the sweep | |
1756 | phase, it uses the smob's tag to find the appropriate @code{free} | |
1757 | function for the smob. It then calls the function, passing it the smob | |
1758 | as its only argument. | |
1759 | ||
1760 | The @code{free} function must release any resources used by the smob. | |
1761 | However, it need not free objects managed by the collector; the | |
eabd8acf MV |
1762 | collector will take care of them. For historical reasons, the return |
1763 | type of the @code{free} function should be @code{size_t}, an unsigned | |
1764 | integral type; the @code{free} function should always return zero. | |
38a93523 NJ |
1765 | |
1766 | Here is how we might write the @code{free} function for the image smob | |
1767 | type: | |
1768 | @example | |
cee2ed4f | 1769 | size_t |
38a93523 NJ |
1770 | free_image (SCM image_smob) |
1771 | @{ | |
1772 | struct image *image = (struct image *) SCM_SMOB_DATA (image_smob); | |
38a93523 | 1773 | |
eabd8acf MV |
1774 | scm_gc_free (image->pixels, image->width * image->height, "image pixels"); |
1775 | scm_gc_free (image, sizeof (struct image), "image"); | |
38a93523 | 1776 | |
eabd8acf | 1777 | return 0; |
38a93523 NJ |
1778 | @} |
1779 | @end example | |
1780 | ||
1781 | During the sweep phase, the garbage collector will clear the mark bits | |
1782 | on all live objects. The code which implements a smob need not do this | |
1783 | itself. | |
1784 | ||
1785 | There is no way for smob code to be notified when collection is | |
1786 | complete. | |
1787 | ||
38a93523 NJ |
1788 | It is usually a good idea to minimize the amount of processing done |
1789 | during garbage collection; keep @code{mark} and @code{free} functions | |
1790 | very simple. Since collections occur at unpredictable times, it is easy | |
1791 | for any unusual activity to interfere with normal code. | |
1792 | ||
1793 | ||
1794 | @node A Common Mistake In Allocating Smobs, Garbage Collecting Simple Smobs, Garbage Collecting Smobs, Defining New Types (Smobs) | |
1795 | @subsection A Common Mistake In Allocating Smobs | |
1796 | ||
1797 | When constructing new objects, you must be careful that the garbage | |
1798 | collector can always find any new objects you allocate. For example, | |
1799 | suppose we wrote the @code{make_image} function this way: | |
1800 | ||
1801 | @example | |
1802 | SCM | |
1803 | make_image (SCM name, SCM s_width, SCM s_height) | |
1804 | @{ | |
1805 | struct image *image; | |
1806 | SCM image_smob; | |
1807 | int width, height; | |
1808 | ||
bd5e6840 | 1809 | SCM_ASSERT (SCM_STRINGP (name), name, SCM_ARG1, "make-image"); |
38a93523 NJ |
1810 | SCM_ASSERT (SCM_INUMP (s_width), s_width, SCM_ARG2, "make-image"); |
1811 | SCM_ASSERT (SCM_INUMP (s_height), s_height, SCM_ARG3, "make-image"); | |
1812 | ||
1813 | width = SCM_INUM (s_width); | |
1814 | height = SCM_INUM (s_height); | |
1815 | ||
eabd8acf | 1816 | image = (struct image *) scm_gc_malloc (sizeof (struct image), "image"); |
38a93523 NJ |
1817 | image->width = width; |
1818 | image->height = height; | |
eabd8acf | 1819 | image->pixels = scm_gc_malloc (width * height, "image pixels"); |
38a93523 NJ |
1820 | |
1821 | /* THESE TWO LINES HAVE CHANGED: */ | |
1822 | image->name = scm_string_copy (name); | |
cee2ed4f | 1823 | image->update_func = scm_c_define_gsubr (@dots{}); |
38a93523 NJ |
1824 | |
1825 | SCM_NEWCELL (image_smob); | |
cee2ed4f MG |
1826 | SCM_SET_CELL_WORD_1 (image_smob, image); |
1827 | SCM_SET_CELL_TYPE (image_smob, image_tag); | |
38a93523 NJ |
1828 | |
1829 | return image_smob; | |
1830 | @} | |
1831 | @end example | |
1832 | ||
1833 | This code is incorrect. The calls to @code{scm_string_copy} and | |
cee2ed4f MG |
1834 | @code{scm_c_define_gsubr} allocate fresh objects. Allocating any new object |
1835 | may cause the garbage collector to run. If @code{scm_c_define_gsubr} | |
38a93523 NJ |
1836 | invokes a collection, the garbage collector has no way to discover that |
1837 | @code{image->name} points to the new string object; the @code{image} | |
1838 | structure is not yet part of any Scheme object, so the garbage collector | |
1839 | will not traverse it. Since the garbage collector cannot find any | |
1840 | references to the new string object, it will free it, leaving | |
1841 | @code{image} pointing to a dead object. | |
1842 | ||
1843 | A correct implementation might say, instead: | |
cee2ed4f | 1844 | |
38a93523 NJ |
1845 | @example |
1846 | image->name = SCM_BOOL_F; | |
1847 | image->update_func = SCM_BOOL_F; | |
1848 | ||
1849 | SCM_NEWCELL (image_smob); | |
cee2ed4f MG |
1850 | SCM_SET_CELL_WORD_1 (image_smob, image); |
1851 | SCM_SET_CELL_TYPE (image_smob, image_tag); | |
38a93523 NJ |
1852 | |
1853 | image->name = scm_string_copy (name); | |
cee2ed4f | 1854 | image->update_func = scm_c_define_gsubr (@dots{}); |
38a93523 NJ |
1855 | |
1856 | return image_smob; | |
1857 | @end example | |
1858 | ||
1859 | Now, by the time we allocate the new string and function objects, | |
1860 | @code{image_smob} points to @code{image}. If the garbage collector | |
1861 | scans the stack, it will find a reference to @code{image_smob} and | |
1862 | traverse @code{image}, so any objects @code{image} points to will be | |
1863 | preserved. | |
1864 | ||
1865 | ||
1866 | @node Garbage Collecting Simple Smobs, A Complete Example, A Common Mistake In Allocating Smobs, Defining New Types (Smobs) | |
1867 | @subsection Garbage Collecting Simple Smobs | |
1868 | ||
1869 | It is often useful to define very simple smob types --- smobs which have | |
cee2ed4f MG |
1870 | no data to mark, other than the cell itself, or smobs whose first data |
1871 | word is simply an ordinary Scheme object, to be marked recursively. | |
1872 | Guile provides some functions to handle these common cases; you can use | |
1873 | them as your smob type's @code{mark} or @code{free} function, if your smob's | |
38a93523 NJ |
1874 | structure is simple enough. |
1875 | ||
1876 | If the smob refers to no other Scheme objects, then no action is | |
1877 | necessary; the garbage collector has already marked the smob cell | |
1878 | itself. In that case, you can use zero as your mark function. | |
1879 | ||
1880 | @deftypefun SCM scm_markcdr (SCM @var{x}) | |
cee2ed4f MG |
1881 | Mark the references in the smob @var{x}, assuming that @var{x}'s first |
1882 | data word contains an ordinary Scheme object, and @var{x} refers to no | |
1883 | other objects. This function simply returns @var{x}'s first data word. | |
1884 | ||
1885 | This is only useful for simple smobs created by @code{SCM_NEWSMOB} or | |
1886 | @code{SCM_RETURN_NEWSMOB}, not for smobs allocated as double cells. | |
38a93523 NJ |
1887 | @end deftypefun |
1888 | ||
cee2ed4f | 1889 | @deftypefun size_t scm_free0 (SCM @var{x}) |
38a93523 NJ |
1890 | Do nothing; return zero. This function is appropriate for smobs that |
1891 | use either zero or @code{scm_markcdr} as their marking functions, and | |
1892 | refer to no heap storage, including memory managed by @code{malloc}, | |
1893 | other than the smob's header cell. | |
cee2ed4f MG |
1894 | |
1895 | This function should not be needed anymore, because simply passing | |
1896 | @code{NULL} as the free function does the same. | |
38a93523 NJ |
1897 | @end deftypefun |
1898 | ||
1899 | ||
1900 | @node A Complete Example | |
1901 | @subsection A Complete Example | |
1902 | ||
1903 | Here is the complete text of the implementation of the image datatype, | |
1904 | as presented in the sections above. We also provide a definition for | |
1905 | the smob's @code{print} function, and make some objects and functions | |
1906 | static, to clarify exactly what the surrounding code is using. | |
1907 | ||
1908 | As mentioned above, you can find this code in the Guile distribution, in | |
1909 | @file{doc/example-smob}. That directory includes a makefile and a | |
1910 | suitable @code{main} function, so you can build a complete interactive | |
1911 | Guile shell, extended with the datatypes described here. | |
1912 | ||
1913 | @example | |
1914 | /* file "image-type.c" */ | |
1915 | ||
1916 | #include <stdlib.h> | |
| #include <string.h> | |
1917 | #include <libguile.h> | |
1918 | ||
9d5315b6 | 1919 | static scm_t_bits image_tag; |
38a93523 NJ |
1920 | |
1921 | struct image @{ | |
1922 | int width, height; | |
1923 | char *pixels; | |
1924 | ||
1925 | /* The name of this image */ | |
1926 | SCM name; | |
1927 | ||
1928 | /* A function to call when this image is | |
1929 | modified, e.g., to update the screen, | |
1930 | or SCM_BOOL_F if no action necessary */ | |
1931 | SCM update_func; | |
1932 | @}; | |
1933 | ||
1934 | static SCM | |
1935 | make_image (SCM name, SCM s_width, SCM s_height) | |
1936 | @{ | |
1937 | struct image *image; | |
38a93523 NJ |
1938 | int width, height; |
1939 | ||
bd5e6840 | 1940 | SCM_ASSERT (SCM_STRINGP (name), name, SCM_ARG1, "make-image"); |
38a93523 NJ |
1941 | SCM_ASSERT (SCM_INUMP (s_width), s_width, SCM_ARG2, "make-image"); |
1942 | SCM_ASSERT (SCM_INUMP (s_height), s_height, SCM_ARG3, "make-image"); | |
1943 | ||
1944 | width = SCM_INUM (s_width); | |
1945 | height = SCM_INUM (s_height); | |
1946 | ||
eabd8acf | 1947 | image = (struct image *) scm_gc_malloc (sizeof (struct image), "image"); |
38a93523 NJ |
1948 | image->width = width; |
1949 | image->height = height; | |
eabd8acf | 1950 | image->pixels = scm_gc_malloc (width * height, "image pixels"); |
38a93523 NJ |
1951 | image->name = name; |
1952 | image->update_func = SCM_BOOL_F; | |
1953 | ||
bd5e6840 | 1954 | SCM_RETURN_NEWSMOB (image_tag, image); |
38a93523 NJ |
1955 | @} |
1956 | ||
1957 | static SCM | |
1958 | clear_image (SCM image_smob) | |
1959 | @{ | |
1960 | int area; | |
1961 | struct image *image; | |
1962 | ||
1963 | SCM_ASSERT (SCM_SMOB_PREDICATE (image_tag, image_smob), | |
1964 | image_smob, SCM_ARG1, "clear-image"); | |
1965 | ||
1966 | image = (struct image *) SCM_SMOB_DATA (image_smob); | |
1967 | area = image->width * image->height; | |
1968 | memset (image->pixels, 0, area); | |
1969 | ||
1970 | /* Invoke the image's update function. */ | |
1971 | if (image->update_func != SCM_BOOL_F) | |
1972 | scm_apply (image->update_func, SCM_EOL, SCM_EOL); | |
1973 | ||
1974 | return SCM_UNSPECIFIED; | |
1975 | @} | |
1976 | ||
1977 | static SCM | |
1978 | mark_image (SCM image_smob) | |
1979 | @{ | |
bd5e6840 | 1980 | /* Mark the image's name and update function. */ |
38a93523 NJ |
1981 | struct image *image = (struct image *) SCM_SMOB_DATA (image_smob); |
1982 | ||
1983 | scm_gc_mark (image->name); | |
1984 | return image->update_func; | |
1985 | @} | |
1986 | ||
cee2ed4f | 1987 | static size_t |
38a93523 NJ |
1988 | free_image (SCM image_smob) |
1989 | @{ | |
1990 | struct image *image = (struct image *) SCM_SMOB_DATA (image_smob); | |
38a93523 | 1991 | |
eabd8acf MV |
1992 | scm_gc_free (image->pixels, image->width * image->height, "image pixels"); |
1993 | scm_gc_free (image, sizeof (struct image), "image"); | |
38a93523 | 1994 | |
eabd8acf | 1995 | return 0; |
38a93523 NJ |
1996 | @} |
1997 | ||
1998 | static int | |
1999 | print_image (SCM image_smob, SCM port, scm_print_state *pstate) | |
2000 | @{ | |
2001 | struct image *image = (struct image *) SCM_SMOB_DATA (image_smob); | |
2002 | ||
2003 | scm_puts ("#<image ", port); | |
2004 | scm_display (image->name, port); | |
2005 | scm_puts (">", port); | |
2006 | ||
2007 | /* non-zero means success */ | |
2008 | return 1; | |
2009 | @} | |
2010 | ||
38a93523 | 2011 | void |
cee2ed4f | 2012 | init_image_type (void) |
38a93523 | 2013 | @{ |
bd5e6840 NJ |
2014 | image_tag = scm_make_smob_type ("image", sizeof (struct image)); |
2015 | scm_set_smob_mark (image_tag, mark_image); | |
2016 | scm_set_smob_free (image_tag, free_image); | |
2017 | scm_set_smob_print (image_tag, print_image); | |
38a93523 | 2018 | |
cee2ed4f MG |
2019 | scm_c_define_gsubr ("clear-image", 1, 0, 0, clear_image); |
2020 | scm_c_define_gsubr ("make-image", 3, 0, 0, make_image); | |
38a93523 NJ |
2021 | @} |
2022 | @end example | |
2023 | ||
2024 | Here is a sample build and interaction with the code from the | |
2025 | @file{example-smob} directory, on the author's machine: | |
2026 | ||
2027 | @example | |
2028 | zwingli:example-smob$ make CC=gcc | |
2029 | gcc `guile-config compile` -c image-type.c -o image-type.o | |
2030 | gcc `guile-config compile` -c myguile.c -o myguile.o | |
2031 | gcc image-type.o myguile.o `guile-config link` -o myguile | |
2032 | zwingli:example-smob$ ./myguile | |
2033 | guile> make-image | |
2034 | #<primitive-procedure make-image> | |
2035 | guile> (define i (make-image "Whistler's Mother" 100 100)) | |
2036 | guile> i | |
2037 | #<image Whistler's Mother> | |
2038 | guile> (clear-image i) | |
2039 | guile> (clear-image 4) | |
2040 | ERROR: In procedure clear-image in expression (clear-image 4): | |
2041 | ERROR: Wrong type argument in position 1: 4 | |
2042 | ABORT: (wrong-type-arg) | |
2043 | ||
2044 | Type "(backtrace)" to get more information. | |
2045 | guile> | |
2046 | @end example | |
2047 | ||
2048 | @c essay @bye |