1 | @c -*-texinfo-*- |
2 | @c This is part of the GNU Guile Reference Manual. | |
@c Copyright (C) 1996, 1997, 2000, 2001, 2002, 2003, 2004, 2010
@c   Free Software Foundation, Inc.
5 | @c See the file guile.texi for copying conditions. | |
6 | ||
7 | @node Data Representation |
8 | @section Data Representation | |
9 | |
10 | Scheme is a latently-typed language; this means that the system cannot, | |
11 | in general, determine the type of a given expression at compile time. | |
12 | Types only become apparent at run time. Variables do not have fixed | |
13 | types; a variable may hold a pair at one point, an integer at the next, | |
14 | and a thousand-element vector later. Instead, values, not variables, | |
15 | have fixed types. | |
16 | ||
17 | In order to implement standard Scheme functions like @code{pair?} and | |
18 | @code{string?} and provide garbage collection, the representation of | |
19 | every value must contain enough information to accurately determine its | |
20 | type at run time. Often, Scheme systems also use this information to | |
21 | determine whether a program has attempted to apply an operation to an | |
22 | inappropriately typed value (such as taking the @code{car} of a string). | |
23 | ||
24 | Because variables, pairs, and vectors may hold values of any type, | |
25 | Scheme implementations use a uniform representation for values --- a | |
26 | single type large enough to hold either a complete value or a pointer | |
27 | to a complete value, along with the necessary typing information. | |
28 | ||
29 | The following sections will present a simple typing system, and then | |
30 | make some refinements to correct its major weaknesses. We then conclude |
31 | with a discussion of specific choices that Guile has made regarding | |
32 | garbage collection and data representation. | |
33 | |
34 | @menu | |
35 | * A Simple Representation:: | |
36 | * Faster Integers:: | |
37 | * Cheaper Pairs:: | |
38 | * Conservative GC:: |
39 | * The SCM Type in Guile:: | |
40 | @end menu |
41 | ||
42 | @node A Simple Representation | |
43 | @subsection A Simple Representation | |
44 | ||
45 | The simplest way to represent Scheme values in C would be to represent |
46 | each value as a pointer to a structure containing a type indicator, | |
47 | followed by a union carrying the real value. Assuming that @code{SCM} is | |
48 | the name of our universal type, we can write: | |

@example
enum type @{ integer, pair, string, vector, ... @};

typedef struct value *SCM;

struct value @{
  enum type type;
  union @{
    int integer;
    struct @{ SCM car, cdr; @} pair;
    struct @{ int length; char *elts; @} string;
    struct @{ int length; SCM *elts; @} vector;
    ...
  @} value;
@};
@end example
66 | with the ellipses replaced with code for the remaining Scheme types. | |
67 | ||
68 | This representation is sufficient to implement all of Scheme's | |
69 | semantics. If @var{x} is an @code{SCM} value: | |
70 | @itemize @bullet | |
71 | @item | |
72 | To test if @var{x} is an integer, we can write @code{@var{x}->type == integer}. | |
73 | @item | |
74 | To find its value, we can write @code{@var{x}->value.integer}. | |
75 | @item | |
76 | To test if @var{x} is a vector, we can write @code{@var{x}->type == vector}. | |
77 | @item | |
78 | If we know @var{x} is a vector, we can write | |
79 | @code{@var{x}->value.vector.elts[0]} to refer to its first element. | |
80 | @item | |
81 | If we know @var{x} is a pair, we can write | |
82 | @code{@var{x}->value.pair.car} to extract its car. | |
83 | @end itemize | |
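
To make the cost of this scheme concrete, here is a minimal, self-contained sketch of the representation restricted to integers and pairs. The constructor and accessor names (@code{make_integer}, @code{make_pair}, @code{integer_value}) and the @code{T_} prefix on the enum constants are assumptions made for this illustration; they do not appear elsewhere in this manual.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical type codes; only two types are modelled in this sketch. */
enum type { T_INTEGER, T_PAIR };

typedef struct value *SCM;

struct value {
  enum type type;
  union {
    int integer;
    struct { SCM car, cdr; } pair;
  } value;
};

/* Every value, even a small integer, costs one heap allocation. */
static SCM make_integer (int n)
{
  SCM x = malloc (sizeof *x);
  assert (x != NULL);
  x->type = T_INTEGER;
  x->value.integer = n;
  return x;
}

static SCM make_pair (SCM car, SCM cdr)
{
  SCM x = malloc (sizeof *x);
  assert (x != NULL);
  x->type = T_PAIR;
  x->value.pair.car = car;
  x->value.pair.cdr = cdr;
  return x;
}

/* A checked accessor: test the type tag before touching the union. */
static int integer_value (SCM x)
{
  assert (x->type == T_INTEGER);
  return x->value.integer;
}
```

Building the pair @code{(1 . 2)} with @code{make_pair (make_integer (1), make_integer (2))} performs three allocations, two of them for the integers alone.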
84 | ||
85 | ||
86 | @node Faster Integers | |
87 | @subsection Faster Integers | |
88 | ||
89 | Unfortunately, the above representation has a serious disadvantage. In | |
90 | order to return an integer, an expression must allocate a @code{struct | |
91 | value}, initialize it to represent that integer, and return a pointer to | |
92 | it. Furthermore, fetching an integer's value requires a memory | |
93 | reference, which is much slower than a register reference on most | |
94 | processors. Since integers are extremely common, this representation is | |
95 | too costly, in both time and space. Integers should be very cheap to | |
96 | create and manipulate. | |
97 | ||
98 | One possible solution comes from the observation that, on many | |
99 | architectures, heap-allocated data (i.e., what you get when you call |
100 | @code{malloc}) must be aligned on an eight-byte boundary. (Whether or | |
101 | not the machine actually requires it, we can write our own allocator for | |
102 | @code{struct value} objects that assures this is true.) In this case, | |
103 | the lower three bits of the structure's address are known to be zero. | |
104 | |
105 | This gives us the room we need to provide an improved representation | |
106 | for integers. We make the following rules: | |
107 | @itemize @bullet | |
108 | @item | |
If the lower three bits of an @code{SCM} value are zero, then the
@code{SCM} value is a pointer to a @code{struct value}, and everything
proceeds as before.
112 | @item | |
113 | Otherwise, the @code{SCM} value represents an integer, whose value | |
114 | appears in its upper bits. | |
115 | @end itemize | |
116 | ||
117 | Here is C code implementing this convention: | |
@example
enum type @{ pair, string, vector, ... @};

typedef struct value *SCM;

struct value @{
  enum type type;
  union @{
    struct @{ SCM car, cdr; @} pair;
    struct @{ int length; char *elts; @} string;
    struct @{ int length; SCM *elts; @} vector;
    ...
  @} value;
@};

#define POINTER_P(x) (((int) (x) & 7) == 0)
#define INTEGER_P(x) (! POINTER_P (x))

#define GET_INTEGER(x)  ((int) (x) >> 3)
#define MAKE_INTEGER(x) ((SCM) (((x) << 3) | 1))
@end example
139 | ||
140 | Notice that @code{integer} no longer appears as an element of @code{enum | |
141 | type}, and the union has lost its @code{integer} member. Instead, we | |
142 | use the @code{POINTER_P} and @code{INTEGER_P} macros to make a coarse | |
143 | classification of values into integers and non-integers, and do further | |
144 | type testing as before. | |
145 | ||
146 | Here's how we would answer the questions posed above (again, assume | |
147 | @var{x} is an @code{SCM} value): | |
148 | @itemize @bullet | |
149 | @item | |
150 | To test if @var{x} is an integer, we can write @code{INTEGER_P (@var{x})}. | |
151 | @item | |
152 | To find its value, we can write @code{GET_INTEGER (@var{x})}. | |
153 | @item | |
154 | To test if @var{x} is a vector, we can write: | |
155 | @example | |
156 | @code{POINTER_P (@var{x}) && @var{x}->type == vector} | |
157 | @end example | |
158 | Given the new representation, we must make sure @var{x} is truly a | |
159 | pointer before we dereference it to determine its complete type. | |
160 | @item | |
161 | If we know @var{x} is a vector, we can write | |
162 | @code{@var{x}->value.vector.elts[0]} to refer to its first element, as | |
163 | before. | |
164 | @item | |
165 | If we know @var{x} is a pair, we can write | |
166 | @code{@var{x}->value.pair.car} to extract its car, just as before. | |
167 | @end itemize | |
168 | ||
169 | This representation allows us to operate more efficiently on integers | |
170 | than the first. For example, if @var{x} and @var{y} are known to be | |
171 | integers, we can compute their sum as follows: | |
172 | @example | |
173 | MAKE_INTEGER (GET_INTEGER (@var{x}) + GET_INTEGER (@var{y})) | |
174 | @end example | |
175 | Now, integer math requires no allocation or memory references. Most real |
176 | Scheme systems actually implement addition and other operations using an | |
177 | even more efficient algorithm, but this essay isn't about | |
178 | bit-twiddling. (Hint: how do you decide when to overflow to a bignum? | |
179 | How would you do it in assembly?) | |
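
As a sketch only, the same convention can be written with inline functions and the pointer-sized integer types from @code{<stdint.h>}, which avoids assuming that an @code{int} can hold a pointer. The lowercase function names and the @code{TAG_} constants are assumptions made for this example. It also assumes that right-shifting a negative signed value is an arithmetic shift, which is implementation-defined in C but holds on common compilers. Overflow into bignums is not handled.

```c
#include <assert.h>
#include <stdint.h>

typedef struct value *SCM;   /* struct value is left incomplete here */

#define TAG_BITS 3
#define TAG_MASK ((uintptr_t) 7)
#define INT_TAG  ((uintptr_t) 1)

static int pointer_p (SCM x) { return ((uintptr_t) x & TAG_MASK) == 0; }
static int integer_p (SCM x) { return !pointer_p (x); }

static SCM make_integer (intptr_t n)
{
  /* Shift as unsigned so a negative n does not invoke undefined behaviour. */
  return (SCM) (((uintptr_t) n << TAG_BITS) | INT_TAG);
}

static intptr_t get_integer (SCM x)
{
  /* Assumes an arithmetic right shift for negative values. */
  return (intptr_t) (uintptr_t) x >> TAG_BITS;
}

/* No allocation and no memory reference: only register arithmetic. */
static SCM add_integers (SCM x, SCM y)
{
  return make_integer (get_integer (x) + get_integer (y));
}
```

With these definitions, @code{get_integer (add_integers (make_integer (40), make_integer (2)))} yields 42 without touching the heap.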
180 | |
181 | ||
182 | @node Cheaper Pairs | |
183 | @subsection Cheaper Pairs | |
184 | ||
185 | However, there is yet another issue to confront. Most Scheme heaps |
186 | contain more pairs than any other type of object; Jonathan Rees said at | |
187 | one point that pairs occupy 45% of the heap in his Scheme | |
188 | implementation, Scheme 48. However, our representation above spends | |
189 | three @code{SCM}-sized words per pair --- one for the type, and two for | |
190 | the @sc{car} and @sc{cdr}. Is there any way to represent pairs using | |
191 | only two words? | |
192 | |
193 | Let us refine the convention we established earlier. Let us assert | |
194 | that: | |
195 | @itemize @bullet | |
196 | @item | |
If the bottom three bits of an @code{SCM} value are @code{#b000}, then
it is a pointer, as before.
@item
If the bottom three bits are @code{#b001}, then the upper bits are an
integer. This is a bit more restrictive than before.
@item
If the bottom three bits are @code{#b010}, then the value, with the bottom
three bits masked out, is the address of a pair.
205 | @end itemize |
206 | ||
207 | Here is the new C code: | |
@example
enum type @{ string, vector, ... @};

typedef struct value *SCM;

struct value @{
  enum type type;
  union @{
    struct @{ int length; char *elts; @} string;
    struct @{ int length; SCM *elts; @} vector;
    ...
  @} value;
@};

struct pair @{
  SCM car, cdr;
@};

#define POINTER_P(x) (((int) (x) & 7) == 0)

#define INTEGER_P(x)    (((int) (x) & 7) == 1)
#define GET_INTEGER(x)  ((int) (x) >> 3)
#define MAKE_INTEGER(x) ((SCM) (((x) << 3) | 1))

#define PAIR_P(x)   (((int) (x) & 7) == 2)
#define GET_PAIR(x) ((struct pair *) ((int) (x) & ~7))
@end example
235 | ||
236 | Notice that @code{enum type} and @code{struct value} now only contain | |
237 | provisions for vectors and strings; both integers and pairs have become | |
238 | special cases. The code above also assumes that an @code{int} is large | |
239 | enough to hold a pointer, which isn't generally true. | |
240 | ||
241 | ||
242 | Our list of examples is now as follows: | |
243 | @itemize @bullet | |
244 | @item | |
245 | To test if @var{x} is an integer, we can write @code{INTEGER_P | |
246 | (@var{x})}; this is as before. | |
247 | @item | |
248 | To find its value, we can write @code{GET_INTEGER (@var{x})}, as | |
249 | before. | |
250 | @item | |
251 | To test if @var{x} is a vector, we can write: | |
252 | @example | |
253 | @code{POINTER_P (@var{x}) && @var{x}->type == vector} | |
254 | @end example | |
255 | We must still make sure that @var{x} is a pointer to a @code{struct | |
256 | value} before dereferencing it to find its type. | |
257 | @item | |
258 | If we know @var{x} is a vector, we can write | |
259 | @code{@var{x}->value.vector.elts[0]} to refer to its first element, as | |
260 | before. | |
261 | @item | |
262 | We can write @code{PAIR_P (@var{x})} to determine if @var{x} is a | |
263 | pair, and then write @code{GET_PAIR (@var{x})->car} to refer to its | |
264 | car. | |
265 | @end itemize | |
266 | ||
This change in representation reduces our heap size by 15% (one word in
three is saved on the 45% of the heap occupied by pairs). It also
makes it cheaper to decide if a value is a pair, because no memory
references are necessary; it suffices to check the bottom three bits of
the @code{SCM} value. This may be significant when traversing lists, a
common activity in a Scheme system.
272 | ||
Again, most real Scheme systems use a slightly different implementation;
for example, if @code{GET_PAIR} subtracts off the low bits of @code{x}, instead
of masking them off, the optimizer will often be able to combine that
subtraction with the addition of the offset of the structure member we
are referencing, making a modified pointer as fast to use as an
unmodified pointer.
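
The subtraction variant might look like the following sketch. The names @code{cons}, @code{pair_p}, @code{get_pair}, and the @code{TAG_} constants are assumptions for this example, and the code relies on @code{malloc} returning storage aligned to at least eight bytes, which it checks with an assertion rather than guaranteeing.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

typedef struct value *SCM;   /* struct value is left incomplete here */

struct pair { SCM car, cdr; };

#define TAG_MASK ((uintptr_t) 7)
#define PAIR_TAG ((uintptr_t) 2)

static int pair_p (SCM x)
{
  return ((uintptr_t) x & TAG_MASK) == PAIR_TAG;
}

/* Subtract the known tag instead of masking.  A compiler can then fold
   the constant into the member offset when ->car or ->cdr is accessed. */
static struct pair *get_pair (SCM x)
{
  return (struct pair *) ((uintptr_t) x - PAIR_TAG);
}

static SCM cons (SCM car, SCM cdr)
{
  struct pair *p = malloc (sizeof *p);
  /* The tagging scheme needs the low three address bits to be zero. */
  assert (p != NULL && ((uintptr_t) p & TAG_MASK) == 0);
  p->car = car;
  p->cdr = cdr;
  return (SCM) ((uintptr_t) p + PAIR_TAG);
}
```

Each pair now occupies exactly two words, and @code{pair_p} inspects only the @code{SCM} word itself.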
279 | ||
280 | ||
281 | @node Conservative GC |
282 | @subsection Conservative Garbage Collection | |
283 | ||
284 | Aside from the latent typing, the major source of constraints on a | |
285 | Scheme implementation's data representation is the garbage collector. | |
286 | The collector must be able to traverse every live object in the heap, to | |
287 | determine which objects are not live, and thus collectable. |
288 | ||
289 | There are many ways to implement this. Guile's garbage collection is | |
290 | built on a library, the Boehm-Demers-Weiser conservative garbage | |
291 | collector (BDW-GC). The BDW-GC ``just works'', for the most part. But | |
292 | since it is interesting to know how these things work, we include here a | |
293 | high-level description of what the BDW-GC does. | |
294 | ||
295 | Garbage collection has two logical phases: a @dfn{mark} phase, in which | |
296 | the set of live objects is enumerated, and a @dfn{sweep} phase, in which | |
297 | objects not traversed in the mark phase are collected. Correct | |
298 | functioning of the collector depends on being able to traverse the | |
299 | entire set of live objects. | |
300 | ||
301 | In the mark phase, the collector scans the system's global variables and | |
302 | the local variables on the stack to determine which objects are | |
303 | immediately accessible by the C code. It then scans those objects to | |
304 | find the objects they point to, and so on. The collector logically sets | |
305 | a @dfn{mark bit} on each object it finds, so each object is traversed | |
306 | only once. | |
307 | |
308 | When the collector can find no unmarked objects pointed to by marked | |
309 | objects, it assumes that any objects that are still unmarked will never | |
310 | be used by the program (since there is no path of dereferences from any | |
311 | global or local variable that reaches them) and deallocates them. | |
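
The two phases can be illustrated with a toy collector over a small fixed heap. This is a sketch for exposition only, not a description of how the BDW-GC is implemented; the @code{struct object} layout, the explicit root array, and all function names are assumptions of the example.

```c
#include <assert.h>
#include <stddef.h>

#define HEAP_SIZE 8

struct object {
  int allocated;
  int marked;
  struct object *car, *cdr;   /* outgoing references, or NULL */
};

static struct object heap[HEAP_SIZE];

static struct object *allocate (struct object *car, struct object *cdr)
{
  for (size_t i = 0; i < HEAP_SIZE; i++)
    if (!heap[i].allocated)
      {
        heap[i].allocated = 1;
        heap[i].marked = 0;
        heap[i].car = car;
        heap[i].cdr = cdr;
        return &heap[i];
      }
  return NULL;   /* heap exhausted */
}

/* Mark phase: the mark bit ensures each object is traversed only once. */
static void mark (struct object *obj)
{
  if (obj == NULL || obj->marked)
    return;
  obj->marked = 1;
  mark (obj->car);
  mark (obj->cdr);
}

/* Sweep phase: free every unmarked object and clear the marks. */
static size_t sweep (void)
{
  size_t freed = 0;
  for (size_t i = 0; i < HEAP_SIZE; i++)
    {
      if (heap[i].allocated && !heap[i].marked)
        {
          heap[i].allocated = 0;
          freed++;
        }
      heap[i].marked = 0;
    }
  return freed;
}

/* Run one collection from an explicit list of roots. */
static size_t collect (struct object **roots, size_t n_roots)
{
  for (size_t i = 0; i < n_roots; i++)
    mark (roots[i]);
  return sweep ();
}
```

An object survives @code{collect} exactly when some chain of @code{car} and @code{cdr} references leads to it from a root.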
312 | ||
313 | In the above paragraphs, we did not specify how the garbage collector | |
314 | finds the global and local variables; as usual, there are many different | |
315 | approaches. Frequently, the programmer must maintain a list of pointers | |
316 | to all global variables that refer to the heap, and another list | |
317 | (adjusted upon entry to and exit from each function) of local variables, | |
318 | for the collector's benefit. | |
319 | ||
320 | The list of global variables is usually not too difficult to maintain, | |
0f7e6c56 | 321 | since global variables are relatively rare. However, an explicitly |
38a93523 | 322 | maintained list of local variables (in the author's personal experience) |
0f7e6c56 | 323 | is a nightmare to maintain. Thus, the BDW-GC uses a technique called |
324 | @dfn{conservative garbage collection}, to make the local variable list |
325 | unnecessary. | |
326 | ||
327 | The trick to conservative collection is to treat the stack as an | |
328 | ordinary range of memory, and assume that @emph{every} word on the stack | |
329 | is a pointer into the heap. Thus, the collector marks all objects whose | |
330 | addresses appear anywhere in the stack, without knowing for sure how | |
331 | that word is meant to be interpreted. | |
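
A much simplified sketch of that idea follows. It treats an arbitrary array of words as if it were the stack and marks any toy heap object whose address appears in it. The real BDW-GC is considerably more sophisticated; the fixed heap, the @code{struct object} layout, and the function names here are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define HEAP_SIZE 4

struct object { int marked; int payload; };

static struct object heap[HEAP_SIZE];

/* Return the object whose start address equals WORD, or NULL if WORD
   does not look like a pointer to the start of a heap object. */
static struct object *object_from_word (uintptr_t word)
{
  uintptr_t base = (uintptr_t) &heap[0];
  uintptr_t end = (uintptr_t) &heap[HEAP_SIZE];
  if (word < base || word >= end)
    return NULL;
  if ((word - base) % sizeof (struct object) != 0)
    return NULL;
  return &heap[(word - base) / sizeof (struct object)];
}

/* Scan N words without knowing their types; mark whatever they might
   point to.  Returns the number of newly marked objects. */
static size_t scan_conservatively (const uintptr_t *words, size_t n)
{
  size_t marked = 0;
  for (size_t i = 0; i < n; i++)
    {
      struct object *obj = object_from_word (words[i]);
      if (obj != NULL && !obj->marked)
        {
          obj->marked = 1;
          marked++;
        }
    }
  return marked;
}
```

Note that an integer that merely happens to equal an object's address keeps that object alive; this is the price of not knowing how each word is meant to be interpreted.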
332 | ||
333 | In addition to the stack, the BDW-GC will also scan static data |
334 | sections. This means that global variables are also scanned when looking | |
335 | for live Scheme objects. | |
336 | ||
38a93523 | 337 | Obviously, such a system will occasionally retain objects that are |
338 | actually garbage, and should be freed. In practice, this is not a |
339 | problem. The alternative, an explicitly maintained list of local | |
38a93523 | 340 | variable addresses, is effectively much less reliable, due to programmer |
341 | error. Interested readers should see the BDW-GC web page at |
342 | @uref{http://www.hpl.hp.com/personal/Hans_Boehm/gc}, for more | |
343 | information. | |
344 | |
345 | ||
346 | @node The SCM Type in Guile |
347 | @subsection The SCM Type in Guile | |
348 | |
349 | Guile classifies Scheme objects into two kinds: those that fit entirely | |
350 | within an @code{SCM}, and those that require heap storage. | |
351 | ||
352 | The former class are called @dfn{immediates}. The class of immediates | |
353 | includes small integers, characters, boolean values, the empty list, the | |
354 | mysterious end-of-file object, and some others. | |
355 | ||
The remaining types are called, not surprisingly, @dfn{non-immediates}.
They include pairs, procedures, strings, vectors, and all other data
types in Guile. For non-immediates, the @code{SCM} word contains a
pointer to data on the heap, and further information about the object
in question is stored in that data.
38a93523 | 361 | |
362 | This section describes how the @code{SCM} type is actually represented |
363 | and used at the C level. Interested readers should see | |
364 | @code{libguile/tags.h} for an exposition of how Guile stores type | |
3f7e8708 | 365 | information. |
38a93523 | 366 | |
367 | In fact, there are two basic C data types to represent objects in |
368 | Guile: @code{SCM} and @code{scm_t_bits}. | |
369 | |
370 | @menu | |
9d5315b6 | 371 | * Relationship between SCM and scm_t_bits:: |
372 | * Immediate objects:: |
373 | * Non-immediate objects:: | |
9d5315b6 | 374 | * Allocating Cells:: |
375 | * Heap Cell Type Information:: |
376 | * Accessing Cell Entries:: | |
377 | @end menu |
378 | ||
379 | ||
380 | @node Relationship between SCM and scm_t_bits |
381 | @subsubsection Relationship between @code{SCM} and @code{scm_t_bits} | |
382 | |
383 | A variable of type @code{SCM} is guaranteed to hold a valid Scheme | |
9d5315b6 | 384 | object. A variable of type @code{scm_t_bits}, on the other hand, may |
385 | hold a representation of a @code{SCM} value as a C integral type, but |
386 | may also hold any C value, even if it does not correspond to a valid | |
387 | Scheme object. | |
388 | ||
For a variable @var{x} of type @code{SCM}, the Scheme object's type
information is stored in a form that is not directly usable. To be able
to work on the type encoding of the Scheme value, the @code{SCM}
variable has to be transformed into the corresponding representation as
a @code{scm_t_bits} variable @var{y} by using the @code{SCM_UNPACK}
macro. Once this has been done, the type of the Scheme object @var{x}
can be derived from the content of the bits of the @code{scm_t_bits}
value @var{y}, in the way illustrated by the example earlier in this
chapter (@pxref{Cheaper Pairs}). Conversely, a valid bit encoding of a
Scheme value as a @code{scm_t_bits} variable can be transformed into the
corresponding @code{SCM} value using the @code{SCM_PACK} macro.
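
To see why such a pair of conversions is useful, consider the following stand-alone model. It is not Guile's actual definition of @code{SCM}; the types @code{toy_scm} and @code{toy_bits} and the functions @code{toy_pack} and @code{toy_unpack} are hypothetical stand-ins. Wrapping the bits in a struct means the compiler rejects accidental arithmetic on an object reference, so every inspection of the tag bits has to go through an explicit unpacking step.

```c
#include <assert.h>
#include <stdint.h>

typedef uintptr_t toy_bits;                  /* stands in for scm_t_bits */
typedef struct { toy_bits bits; } toy_scm;   /* stands in for SCM */

/* Explicit conversions, playing the roles of SCM_UNPACK and SCM_PACK. */
static toy_bits toy_unpack (toy_scm x) { return x.bits; }
static toy_scm toy_pack (toy_bits b) { toy_scm x = { b }; return x; }

/* Tag test in the style of the earlier example: low three bits == 1.
   Writing (x & 7) directly on a toy_scm would not compile. */
static int toy_integer_p (toy_scm x)
{
  return (toy_unpack (x) & 7) == 1;
}
```

In this model, @code{toy_pack (toy_unpack (x))} always reproduces @code{x}, while only values built from a valid bit encoding may be passed to @code{toy_pack}.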
400 | ||
401 | @node Immediate objects |
402 | @subsubsection Immediate objects | |
403 | ||
679cceed | 404 | A Scheme object may either be an immediate, i.e.@: carrying all necessary |
405 | information by itself, or it may contain a reference to a @dfn{cell} |
406 | with additional information on the heap. Although in general it should | |
407 | be irrelevant for user code whether an object is an immediate or not, | |
408 | within Guile's own code the distinction is sometimes of importance. | |
409 | Thus, the following low level macro is provided: | |
410 | ||
411 | @deftypefn Macro int SCM_IMP (SCM @var{x}) | |
412 | A Scheme object is an immediate if it fulfills the @code{SCM_IMP} | |
413 | predicate, otherwise it holds an encoded reference to a heap cell. The | |
414 | result of the predicate is delivered as a C style boolean value. User | |
415 | code and code that extends Guile should normally not be required to use | |
416 | this macro. | |
417 | @end deftypefn | |
418 | ||
419 | @noindent | |
420 | Summary: | |
421 | @itemize @bullet | |
422 | @item | |
423 | Given a Scheme object @var{x} of unknown type, check first | |
424 | with @code{SCM_IMP (@var{x})} if it is an immediate object. | |
425 | @item | |
426 | If so, all of the type and value information can be determined from the | |
9d5315b6 | 427 | @code{scm_t_bits} value that is delivered by @code{SCM_UNPACK |
428 | (@var{x})}. |
429 | @end itemize | |
430 | ||
431 | There are a number of special values in Scheme, most of them documented |
432 | elsewhere in this manual. It's not quite the right place to put them, | |
433 | but for now, here's a list of the C names given to some of these values: | |
434 | ||
435 | @deftypefn Macro SCM SCM_EOL | |
436 | The Scheme empty list object, or ``End Of List'' object, usually written | |
437 | in Scheme as @code{'()}. | |
438 | @end deftypefn | |
439 | ||
440 | @deftypefn Macro SCM SCM_EOF_VAL | |
441 | The Scheme end-of-file value. It has no standard written | |
442 | representation, for obvious reasons. | |
443 | @end deftypefn | |
444 | ||
445 | @deftypefn Macro SCM SCM_UNSPECIFIED | |
446 | The value returned by some (but not all) expressions that the Scheme |
447 | standard says return an ``unspecified'' value. | |
448 | |
449 | This is sort of a weirdly literal way to take things, but the standard | |
450 | read-eval-print loop prints nothing when the expression returns this | |
451 | value, so it's not a bad idea to return this when you can't think of | |
452 | anything else helpful. | |
453 | @end deftypefn | |
454 | ||
455 | @deftypefn Macro SCM SCM_UNDEFINED | |
The ``undefined'' value. Its most important property is that it is not
equal to any valid Scheme value. This is put to various internal uses
by C code interacting with Guile.

For example, when you write a C function that is callable from Scheme
and which takes optional arguments, the interpreter passes
@code{SCM_UNDEFINED} for any arguments that the caller did not supply.
463 | ||
464 | We also use this to mark unbound variables. | |
465 | @end deftypefn | |
466 | ||
467 | @deftypefn Macro int SCM_UNBNDP (SCM @var{x}) | |
468 | Return true if @var{x} is @code{SCM_UNDEFINED}. Note that this is not a | |
469 | check to see if @var{x} is @code{SCM_UNBOUND}. History will not be kind | |
470 | to us. | |
471 | @end deftypefn | |
472 | ||
473 | |
474 | @node Non-immediate objects | |
475 | @subsubsection Non-immediate objects | |
476 | ||
85a9b4ed | 477 | A Scheme object of type @code{SCM} that does not fulfill the |
478 | @code{SCM_IMP} predicate holds an encoded reference to a heap cell. |
479 | This reference can be decoded to a C pointer to a heap cell using the | |
480 | @code{SCM2PTR} macro. The encoding of a pointer to a heap cell into a | |
481 | @code{SCM} value is done using the @code{PTR2SCM} macro. | |
482 | ||
483 | @c (FIXME:: this name should be changed) | |
eff313ed | 484 | @deftypefn Macro {scm_t_cell *} SCM2PTR (SCM @var{x}) |
485 | Extract and return the heap cell pointer from a non-immediate @code{SCM} |
486 | object @var{x}. | |
487 | @end deftypefn | |
488 | ||
489 | @c (FIXME:: this name should be changed) | |
228a24ef | 490 | @deftypefn Macro SCM PTR2SCM (scm_t_cell * @var{x}) |
491 | Return a @code{SCM} value that encodes a reference to the heap cell |
492 | pointer @var{x}. | |
493 | @end deftypefn | |
494 | ||
495 | Note that it is also possible to transform a non-immediate @code{SCM} | |
9d5315b6 | 496 | value by using @code{SCM_UNPACK} into a @code{scm_t_bits} variable. |
505392ae | 497 | However, the result of @code{SCM_UNPACK} may not be used as a pointer to |
228a24ef | 498 | a @code{scm_t_cell}: only @code{SCM2PTR} is guaranteed to transform a |
499 | @code{SCM} object into a valid pointer to a heap cell. Also, it is not |
500 | allowed to apply @code{PTR2SCM} to anything that is not a valid pointer | |
501 | to a heap cell. | |
502 | ||
503 | @noindent | |
504 | Summary: | |
505 | @itemize @bullet | |
506 | @item | |
507 | Only use @code{SCM2PTR} on @code{SCM} values for which @code{SCM_IMP} is | |
508 | false! | |
509 | @item | |
228a24ef | 510 | Don't use @code{(scm_t_cell *) SCM_UNPACK (@var{x})}! Use @code{SCM2PTR |
511 | (@var{x})} instead! |
512 | @item | |
513 | Don't use @code{PTR2SCM} for anything but a cell pointer! | |
514 | @end itemize | |
515 | ||
516 | @node Allocating Cells |
517 | @subsubsection Allocating Cells | |
518 | ||
Guile provides both ordinary cells with two slots, and double cells
with four slots. The following two functions are the most primitive
way to allocate such cells.

If the caller intends to use a cell as a header for some other type, she
must pass an appropriate magic value in @var{word_0}, to mark it as a
member of that type, and pass whatever values the type expects in
@var{word_1}, etc. You should generally not need these functions,
unless you are implementing a new datatype, and thoroughly understand
the code in @code{<libguile/tags.h>}.
529 | ||
530 | If you just want to allocate pairs, use @code{scm_cons}. | |
531 | ||
228a24ef | 532 | @deftypefn Function SCM scm_cell (scm_t_bits word_0, scm_t_bits word_1) |
533 | Allocate a new cell, initialize the two slots with @var{word_0} and |
534 | @var{word_1}, and return it. | |
535 | ||
536 | Note that @var{word_0} and @var{word_1} are of type @code{scm_t_bits}. | |
537 | If you want to pass a @code{SCM} object, you need to use | |
538 | @code{SCM_UNPACK}. | |
539 | @end deftypefn | |
540 | ||
541 | @deftypefn Function SCM scm_double_cell (scm_t_bits word_0, scm_t_bits word_1, scm_t_bits word_2, scm_t_bits word_3) |
542 | Like @code{scm_cell}, but allocates a double cell with four | |
543 | slots. |
544 | @end deftypefn | |
545 | |
546 | @node Heap Cell Type Information | |
547 | @subsubsection Heap Cell Type Information | |
548 | ||
Heap cells contain a number of entries, each of which is either a Scheme
object of type @code{SCM} or a raw C value of type @code{scm_t_bits}.
Which of the cell entries contain Scheme objects and which contain raw C
values is determined by the first entry of the cell, which holds the
cell type information.
554 | ||
9d5315b6 | 555 | @deftypefn Macro scm_t_bits SCM_CELL_TYPE (SCM @var{x}) |
556 | For a non-immediate Scheme object @var{x}, deliver the content of the |
557 | first entry of the heap cell referenced by @var{x}. This value holds | |
558 | the information about the cell type. | |
559 | @end deftypefn | |
560 | ||
9d5315b6 | 561 | @deftypefn Macro void SCM_SET_CELL_TYPE (SCM @var{x}, scm_t_bits @var{t}) |
562 | For a non-immediate Scheme object @var{x}, write the value @var{t} into |
563 | the first entry of the heap cell referenced by @var{x}. The value | |
564 | @var{t} must hold a valid cell type. | |
565 | @end deftypefn | |
566 | ||
567 | ||
568 | @node Accessing Cell Entries | |
569 | @subsubsection Accessing Cell Entries | |
570 | ||
571 | For a non-immediate Scheme object @var{x}, the object type can be | |
572 | determined by reading the cell type entry using the @code{SCM_CELL_TYPE} | |
573 | macro. For each different type of cell it is known which cell entries | |
574 | hold Scheme objects and which cell entries hold raw C data. To access | |
575 | the different cell entries appropriately, the following macros are | |
576 | provided. | |
577 | ||
@deftypefn Macro scm_t_bits SCM_CELL_WORD (SCM @var{x}, unsigned int @var{n})
Deliver the cell entry @var{n} of the heap cell referenced by the
non-immediate Scheme object @var{x} as raw data. It is illegal to
access cell entries that hold Scheme objects by using these macros. For
convenience, the following macros are also provided.
@itemize @bullet
584 | @item |
585 | SCM_CELL_WORD_0 (@var{x}) @result{} SCM_CELL_WORD (@var{x}, 0) | |
586 | @item | |
587 | SCM_CELL_WORD_1 (@var{x}) @result{} SCM_CELL_WORD (@var{x}, 1) | |
588 | @item | |
589 | @dots{} | |
590 | @item | |
591 | SCM_CELL_WORD_@var{n} (@var{x}) @result{} SCM_CELL_WORD (@var{x}, @var{n}) | |
592 | @end itemize | |
593 | @end deftypefn | |
594 | ||
@deftypefn Macro SCM SCM_CELL_OBJECT (SCM @var{x}, unsigned int @var{n})
Deliver the cell entry @var{n} of the heap cell referenced by the
non-immediate Scheme object @var{x} as a Scheme object. It is illegal
to access cell entries that do not hold Scheme objects by using these
macros. For convenience, the following macros are also provided.
@itemize @bullet
601 | @item |
602 | SCM_CELL_OBJECT_0 (@var{x}) @result{} SCM_CELL_OBJECT (@var{x}, 0) | |
603 | @item | |
604 | SCM_CELL_OBJECT_1 (@var{x}) @result{} SCM_CELL_OBJECT (@var{x}, 1) | |
605 | @item | |
606 | @dots{} | |
607 | @item | |
608 | SCM_CELL_OBJECT_@var{n} (@var{x}) @result{} SCM_CELL_OBJECT (@var{x}, | |
609 | @var{n}) | |
610 | @end itemize | |
611 | @end deftypefn | |
612 | ||
@deftypefn Macro void SCM_SET_CELL_WORD (SCM @var{x}, unsigned int @var{n}, scm_t_bits @var{w})
Write the raw C value @var{w} into entry number @var{n} of the heap cell
referenced by the non-immediate Scheme value @var{x}. Values that are
written into cells this way may only be read from the cells using the
@code{SCM_CELL_WORD} macros or, in case cell entry 0 is written, using
the @code{SCM_CELL_TYPE} macro. For the special case of cell entry 0,
@var{w} must contain cell type information that does not describe a
Scheme object. For convenience, the following macros are also provided.
@itemize @bullet
623 | @item |
624 | SCM_SET_CELL_WORD_0 (@var{x}, @var{w}) @result{} SCM_SET_CELL_WORD | |
625 | (@var{x}, 0, @var{w}) | |
626 | @item | |
627 | SCM_SET_CELL_WORD_1 (@var{x}, @var{w}) @result{} SCM_SET_CELL_WORD | |
628 | (@var{x}, 1, @var{w}) | |
629 | @item | |
630 | @dots{} | |
631 | @item | |
632 | SCM_SET_CELL_WORD_@var{n} (@var{x}, @var{w}) @result{} SCM_SET_CELL_WORD | |
633 | (@var{x}, @var{n}, @var{w}) | |
634 | @end itemize | |
635 | @end deftypefn | |
636 | ||
637 | @deftypefn Macro void SCM_SET_CELL_OBJECT (SCM @var{x}, unsigned int @var{n}, SCM @var{o}) | |
638 | Write the Scheme object @var{o} into entry number @var{n} of the heap | |
639 | cell referenced by the non-immediate Scheme value @var{x}. Values that | |
640 | are written into cells this way may only be read from the cells using | |
641 | the @code{SCM_CELL_OBJECT} macros or, in case cell entry 0 is written, | |
642 | using the @code{SCM_CELL_TYPE} macro. For the special case of cell | |
643 | entry 0 the writing of a Scheme object into this cell is only allowed | |
644 | if the cell forms a Scheme pair. For convenience, the following macros | |
645 | are also provided. | |
230712c9 | 646 | @itemize @bullet |
647 | @item |
648 | SCM_SET_CELL_OBJECT_0 (@var{x}, @var{o}) @result{} SCM_SET_CELL_OBJECT | |
649 | (@var{x}, 0, @var{o}) | |
650 | @item | |
651 | SCM_SET_CELL_OBJECT_1 (@var{x}, @var{o}) @result{} SCM_SET_CELL_OBJECT | |
652 | (@var{x}, 1, @var{o}) | |
653 | @item | |
654 | @dots{} | |
655 | @item | |
656 | SCM_SET_CELL_OBJECT_@var{n} (@var{x}, @var{o}) @result{} | |
657 | SCM_SET_CELL_OBJECT (@var{x}, @var{n}, @var{o}) | |
658 | @end itemize | |
659 | @end deftypefn | |
660 | ||
661 | @noindent | |
662 | Summary: | |
663 | @itemize @bullet | |
664 | @item | |
665 | For a non-immediate Scheme object @var{x} of unknown type, get the type | |
666 | information by using @code{SCM_CELL_TYPE (@var{x})}. | |
667 | @item | |
668 | As soon as the cell type information is available, only use the | |
669 | appropriate access methods to read and write data to the different cell | |
670 | entries. | |
671 | @end itemize | |
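
The discipline summarized above can be modelled without Guile itself. The following stand-alone sketch uses hypothetical names throughout (@code{toy_cell}, @code{toy_make_cell}, @code{toy_cell_word}, @code{toy_set_cell_word}, @code{toy_cell_type}) and an invented type code; it only illustrates the idea that word 0 of a cell identifies its type and that the remaining words are reached through accessors, and makes no claim about Guile's real cell layout.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

typedef uintptr_t toy_bits;                      /* stands in for scm_t_bits */
typedef struct { toy_bits word[2]; } toy_cell;   /* a two-slot heap cell */

#define TOY_TYPE_BOX ((toy_bits) 0x7f)   /* invented type code, not a Guile tag */

/* Allocate a cell and initialize both slots, in the spirit of scm_cell. */
static toy_cell *toy_make_cell (toy_bits word_0, toy_bits word_1)
{
  toy_cell *c = malloc (sizeof *c);
  assert (c != NULL);
  c->word[0] = word_0;
  c->word[1] = word_1;
  return c;
}

/* Read entry N as raw data. */
static toy_bits toy_cell_word (const toy_cell *c, unsigned int n)
{
  assert (n < 2);
  return c->word[n];
}

/* Write raw data W into entry N. */
static void toy_set_cell_word (toy_cell *c, unsigned int n, toy_bits w)
{
  assert (n < 2);
  c->word[n] = w;
}

/* The cell type is simply the content of entry 0. */
static toy_bits toy_cell_type (const toy_cell *c)
{
  return toy_cell_word (c, 0);
}
```

A caller would first check @code{toy_cell_type} and only then decide how to interpret entry 1, mirroring the order of operations recommended in the summary.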
672 | ||
673 | ||
674 | @c Local Variables: |
675 | @c TeX-master: "guile.texi" | |
676 | @c End: |