Commit | Line | Data |
---|---|---|
2da09c3f MV |
1 | @c -*-texinfo-*- |
2 | @c This is part of the GNU Guile Reference Manual. | |
3 | @c Copyright (C) 1996, 1997, 2000, 2001, 2002, 2003, 2004 | |
4 | @c Free Software Foundation, Inc. | |
5 | @c See the file guile.texi for copying conditions. | |
6 | ||
38a93523 NJ |
7 | @node Data Representation in Scheme |
8 | @section Data Representation in Scheme | |
9 | ||
10 | Scheme is a latently-typed language; this means that the system cannot, | |
11 | in general, determine the type of a given expression at compile time. | |
12 | Types only become apparent at run time. Variables do not have fixed | |
13 | types; a variable may hold a pair at one point, an integer at the next, | |
14 | and a thousand-element vector later. Instead, values, not variables, | |
15 | have fixed types. | |
16 | ||
17 | In order to implement standard Scheme functions like @code{pair?} and | |
18 | @code{string?} and provide garbage collection, the representation of | |
19 | every value must contain enough information to accurately determine its | |
20 | type at run time. Often, Scheme systems also use this information to | |
21 | determine whether a program has attempted to apply an operation to an | |
22 | inappropriately typed value (such as taking the @code{car} of a string). | |
23 | ||
24 | Because variables, pairs, and vectors may hold values of any type, | |
25 | Scheme implementations use a uniform representation for values --- a | |
26 | single type large enough to hold either a complete value or a pointer | |
27 | to a complete value, along with the necessary typing information. | |
28 | ||
29 | The following sections will present a simple typing system, and then | |
30 | make some refinements to correct its major weaknesses. However, this is | |
31 | not a description of the system Guile actually uses. It is only an | |
32 | illustration of the issues Guile's system must address. We provide all | |
8680d53b AW |
33 | the information one needs to work with Guile's data in @ref{The |
34 | Libguile Runtime Environment}. | |
38a93523 NJ |
35 | |
36 | ||
37 | @menu | |
38 | * A Simple Representation:: | |
39 | * Faster Integers:: | |
40 | * Cheaper Pairs:: | |
41 | * Guile Is Hairier:: | |
42 | @end menu | |
43 | ||
44 | @node A Simple Representation | |
45 | @subsection A Simple Representation | |
46 | ||
47 | The simplest way to meet the above requirements in C would be to | |
48 | represent each value as a pointer to a structure containing a type | |
49 | indicator, followed by a union carrying the real value. Assuming that | |
50 | @code{SCM} is the name of our universal type, we can write: | |
51 | ||
52 | @example | |
53 | enum type @{ integer, pair, string, vector, ... @}; | |
54 | ||
55 | typedef struct value *SCM; | |
56 | ||
57 | struct value @{ | |
58 | enum type type; | |
59 | union @{ | |
60 | int integer; | |
61 | struct @{ SCM car, cdr; @} pair; | |
62 | struct @{ int length; char *elts; @} string; | |
63 | struct @{ int length; SCM *elts; @} vector; | |
64 | ... | |
65 | @} value; | |
66 | @}; | |
67 | @end example | |
68 | with the ellipses replaced with code for the remaining Scheme types. | |
69 | ||
70 | This representation is sufficient to implement all of Scheme's | |
71 | semantics. If @var{x} is an @code{SCM} value: | |
72 | @itemize @bullet | |
73 | @item | |
74 | To test if @var{x} is an integer, we can write @code{@var{x}->type == integer}. | |
75 | @item | |
76 | To find its value, we can write @code{@var{x}->value.integer}. | |
77 | @item | |
78 | To test if @var{x} is a vector, we can write @code{@var{x}->type == vector}. | |
79 | @item | |
80 | If we know @var{x} is a vector, we can write | |
81 | @code{@var{x}->value.vector.elts[0]} to refer to its first element. | |
82 | @item | |
83 | If we know @var{x} is a pair, we can write | |
84 | @code{@var{x}->value.pair.car} to extract its car. | |
85 | @end itemize | |
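
The accessors listed above can be wrapped in small checked functions. The following is a minimal sketch, not Guile code: the names @code{make_integer}, @code{make_pair}, @code{integer_value} and @code{pair_car} are hypothetical, and only the integer and pair cases of the union are kept for brevity.

```c
#include <assert.h>
#include <stdlib.h>

/* Only two of the Scheme types, to keep the sketch short. */
enum type { TYPE_INTEGER, TYPE_PAIR };

typedef struct value *SCM;

struct value {
  enum type type;
  union {
    int integer;
    struct { SCM car, cdr; } pair;
  } value;
};

/* Box an integer: note that every integer costs one heap allocation. */
SCM make_integer(int n) {
  SCM x = malloc(sizeof *x);
  if (x == NULL) abort();
  x->type = TYPE_INTEGER;
  x->value.integer = n;
  return x;
}

SCM make_pair(SCM car, SCM cdr) {
  SCM x = malloc(sizeof *x);
  if (x == NULL) abort();
  x->type = TYPE_PAIR;
  x->value.pair.car = car;
  x->value.pair.cdr = cdr;
  return x;
}

/* Checked accessors: the type field is tested before the union is read. */
int integer_value(SCM x) {
  assert(x->type == TYPE_INTEGER);
  return x->value.integer;
}

SCM pair_car(SCM x) {
  assert(x->type == TYPE_PAIR);
  return x->value.pair.car;
}
```

Because the type field is always present, a mistake such as taking the car of an integer is caught by the assertion instead of silently reading the wrong union member.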
86 | ||
87 | ||
88 | @node Faster Integers | |
89 | @subsection Faster Integers | |
90 | ||
91 | Unfortunately, the above representation has a serious disadvantage. In | |
92 | order to return an integer, an expression must allocate a @code{struct | |
93 | value}, initialize it to represent that integer, and return a pointer to | |
94 | it. Furthermore, fetching an integer's value requires a memory | |
95 | reference, which is much slower than a register reference on most | |
96 | processors. Since integers are extremely common, this representation is | |
97 | too costly, in both time and space. Integers should be very cheap to | |
98 | create and manipulate. | |
99 | ||
100 | One possible solution comes from the observation that, on many | |
101 | architectures, structures must be aligned on a four-byte boundary. | |
102 | (Whether or not the machine actually requires it, we can write our own | |
103 | allocator for @code{struct value} objects that assures this is true.) | |
104 | In this case, the lower two bits of the structure's address are known to | |
105 | be zero. | |
106 | ||
107 | This gives us the room we need to provide an improved representation | |
108 | for integers. We make the following rules: | |
109 | @itemize @bullet | |
110 | @item | |
111 | If the lower two bits of an @code{SCM} value are zero, then the @code{SCM} | |
112 | value is a pointer to a @code{struct value}, and everything proceeds as | |
113 | before. | |
114 | @item | |
115 | Otherwise, the @code{SCM} value represents an integer, whose value | |
116 | appears in its upper bits. | |
117 | @end itemize | |
118 | ||
119 | Here is C code implementing this convention: | |
120 | @example | |
121 | enum type @{ pair, string, vector, ... @}; | |
122 | ||
123 | typedef struct value *SCM; | |
124 | ||
125 | struct value @{ | |
126 | enum type type; | |
127 | union @{ | |
128 | struct @{ SCM car, cdr; @} pair; | |
129 | struct @{ int length; char *elts; @} string; | |
130 | struct @{ int length; SCM *elts; @} vector; | |
131 | ... | |
132 | @} value; | |
133 | @}; | |
134 | ||
135 | #define POINTER_P(x) (((int) (x) & 3) == 0) | |
136 | #define INTEGER_P(x) (! POINTER_P (x)) | |
137 | ||
138 | #define GET_INTEGER(x) ((int) (x) >> 2) | |
139 | #define MAKE_INTEGER(x) ((SCM) (((x) << 2) | 1)) | |
140 | @end example | |
141 | ||
142 | Notice that @code{integer} no longer appears as an element of @code{enum | |
143 | type}, and the union has lost its @code{integer} member. Instead, we | |
144 | use the @code{POINTER_P} and @code{INTEGER_P} macros to make a coarse | |
145 | classification of values into integers and non-integers, and do further | |
146 | type testing as before. | |
147 | ||
148 | Here's how we would answer the questions posed above (again, assume | |
149 | @var{x} is an @code{SCM} value): | |
150 | @itemize @bullet | |
151 | @item | |
152 | To test if @var{x} is an integer, we can write @code{INTEGER_P (@var{x})}. | |
153 | @item | |
154 | To find its value, we can write @code{GET_INTEGER (@var{x})}. | |
155 | @item | |
156 | To test if @var{x} is a vector, we can write: | |
157 | @example | |
158 | @code{POINTER_P (@var{x}) && @var{x}->type == vector} | |
159 | @end example | |
160 | Given the new representation, we must make sure @var{x} is truly a | |
161 | pointer before we dereference it to determine its complete type. | |
162 | @item | |
163 | If we know @var{x} is a vector, we can write | |
164 | @code{@var{x}->value.vector.elts[0]} to refer to its first element, as | |
165 | before. | |
166 | @item | |
167 | If we know @var{x} is a pair, we can write | |
168 | @code{@var{x}->value.pair.car} to extract its car, just as before. | |
169 | @end itemize | |
170 | ||
171 | This representation allows us to operate more efficiently on integers | |
172 | than the first. For example, if @var{x} and @var{y} are known to be | |
173 | integers, we can compute their sum as follows: | |
174 | @example | |
175 | MAKE_INTEGER (GET_INTEGER (@var{x}) + GET_INTEGER (@var{y})) | |
176 | @end example | |
177 | Now, integer math requires no allocation or memory references. Most | |
178 | real Scheme systems actually use an even more efficient representation, | |
179 | but this essay isn't about bit-twiddling. (Hint: what if pointers had | |
180 | @code{01} in their least significant bits, and integers had @code{00}?) | |
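
As a variation on the macros above, here is a minimal sketch that avoids casting pointers to @code{int}. It assumes that @code{intptr_t} from @code{<stdint.h>} is available and treats @code{SCM} as that integer type; the function names are hypothetical. Integers carry the tag bits @code{01}, exactly as @code{MAKE_INTEGER} produces them, and multiplication and exact division are used instead of shifts so that negative values are handled portably.

```c
#include <assert.h>
#include <stdint.h>

typedef intptr_t SCM;               /* assumption: wide enough for a pointer */

#define TAG_MASK    ((intptr_t) 3)
#define INTEGER_TAG ((intptr_t) 1)

int integer_p(SCM x) {
  return (x & TAG_MASK) == INTEGER_TAG;
}

/* Encode n as 4n + 1.  Like MAKE_INTEGER, this does not check for overflow. */
SCM make_integer(intptr_t n) {
  return n * 4 + INTEGER_TAG;
}

/* x - 1 is an exact multiple of 4, so the division is exact,
   including for negative values. */
intptr_t get_integer(SCM x) {
  assert(integer_p(x));
  return (x - INTEGER_TAG) / 4;
}

/* Integer addition with no allocation and no memory reference. */
SCM add_integers(SCM x, SCM y) {
  return make_integer(get_integer(x) + get_integer(y));
}
```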
181 | ||
182 | ||
183 | @node Cheaper Pairs | |
184 | @subsection Cheaper Pairs | |
185 | ||
186 | However, there is yet another issue to confront. Most Scheme heaps | |
187 | contain more pairs than any other type of object; Jonathan Rees says | |
188 | that pairs occupy 45% of the heap in his Scheme implementation, Scheme | |
189 | 48. However, our representation above spends three @code{SCM}-sized | |
190 | words per pair --- one for the type, and two for the @sc{car} and | |
191 | @sc{cdr}. Is there any way to represent pairs using only two words? | |
192 | ||
193 | Let us refine the convention we established earlier. Let us assert | |
194 | that: | |
195 | @itemize @bullet | |
196 | @item | |
197 | If the bottom two bits of an @code{SCM} value are @code{#b00}, then | |
198 | it is a pointer, as before. | |
199 | @item | |
200 | If the bottom two bits are @code{#b01}, then the upper bits are an | |
201 | integer. This is a bit more restrictive than before. | |
202 | @item | |
203 | If the bottom two bits are @code{#b10}, then the value, with the bottom | |
204 | two bits masked out, is the address of a pair. | |
205 | @end itemize | |
206 | ||
207 | Here is the new C code: | |
208 | @example | |
209 | enum type @{ string, vector, ... @}; | |
210 | ||
211 | typedef struct value *SCM; | |
212 | ||
213 | struct value @{ | |
214 | enum type type; | |
215 | union @{ | |
216 | struct @{ int length; char *elts; @} string; | |
217 | struct @{ int length; SCM *elts; @} vector; | |
218 | ... | |
219 | @} value; | |
220 | @}; | |
221 | ||
222 | struct pair @{ | |
223 | SCM car, cdr; | |
224 | @}; | |
225 | ||
226 | #define POINTER_P(x) (((int) (x) & 3) == 0) | |
227 | ||
228 | #define INTEGER_P(x) (((int) (x) & 3) == 1) | |
229 | #define GET_INTEGER(x) ((int) (x) >> 2) | |
230 | #define MAKE_INTEGER(x) ((SCM) (((x) << 2) | 1)) | |
231 | ||
232 | #define PAIR_P(x) (((int) (x) & 3) == 2) | |
233 | #define GET_PAIR(x) ((struct pair *) ((int) (x) & ~3)) | |
234 | @end example | |
235 | ||
236 | Notice that @code{enum type} and @code{struct value} now only contain | |
237 | provisions for vectors and strings; both integers and pairs have become | |
238 | special cases. The code above also assumes that an @code{int} is large | |
239 | enough to hold a pointer, which isn't generally true. | |
240 | ||
241 | ||
242 | Our list of examples is now as follows: | |
243 | @itemize @bullet | |
244 | @item | |
245 | To test if @var{x} is an integer, we can write @code{INTEGER_P | |
246 | (@var{x})}; this is as before. | |
247 | @item | |
248 | To find its value, we can write @code{GET_INTEGER (@var{x})}, as | |
249 | before. | |
250 | @item | |
251 | To test if @var{x} is a vector, we can write: | |
252 | @example | |
253 | @code{POINTER_P (@var{x}) && @var{x}->type == vector} | |
254 | @end example | |
255 | We must still make sure that @var{x} is a pointer to a @code{struct | |
256 | value} before dereferencing it to find its type. | |
257 | @item | |
258 | If we know @var{x} is a vector, we can write | |
259 | @code{@var{x}->value.vector.elts[0]} to refer to its first element, as | |
260 | before. | |
261 | @item | |
262 | We can write @code{PAIR_P (@var{x})} to determine if @var{x} is a | |
263 | pair, and then write @code{GET_PAIR (@var{x})->car} to refer to its | |
264 | car. | |
265 | @end itemize | |
266 | ||
267 | This change in representation reduces our heap size by about 15%, since pairs (45% of the heap by the figure above) shrink from three words to two. It also | |
268 | makes it cheaper to decide if a value is a pair, because no memory | |
269 | references are necessary; it suffices to check the bottom two bits of | |
270 | the @code{SCM} value. This may be significant when traversing lists, a | |
271 | common activity in a Scheme system. | |
272 | ||
85a9b4ed | 273 | Again, most real Scheme systems use a slightly different implementation; |
38a93523 NJ |
274 | for example, if @code{GET_PAIR} subtracts off the low bits of @code{x}, instead |
275 | of masking them off, the optimizer will often be able to combine that | |
276 | subtraction with the addition of the offset of the structure member we | |
277 | are referencing, making a modified pointer as fast to use as an | |
278 | unmodified pointer. | |
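
The pair convention, together with the subtracting form of @code{GET_PAIR} just mentioned, can be sketched as follows. This is an illustration under stated assumptions, not Guile's implementation: @code{SCM} is taken to be @code{uintptr_t}, @code{malloc} is assumed to return memory aligned to at least four bytes, and @code{EMPTY_LIST}, @code{cons} and @code{list_length} are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

typedef uintptr_t SCM;

#define TAG_MASK   ((uintptr_t) 3)
#define PAIR_TAG   ((uintptr_t) 2)
#define EMPTY_LIST ((SCM) 0)        /* hypothetical end-of-list marker */

struct pair { SCM car, cdr; };      /* two words, no type field */

/* No memory reference is needed to recognize a pair. */
int pair_p(SCM x) {
  return (x & TAG_MASK) == PAIR_TAG;
}

/* Subtract the tag instead of masking it, so a compiler can fold the
   constant into the offset of the member being accessed. */
struct pair *get_pair(SCM x) {
  assert(pair_p(x));
  return (struct pair *) (x - PAIR_TAG);
}

SCM cons(SCM car, SCM cdr) {
  struct pair *p = malloc(sizeof *p);
  if (p == NULL) abort();
  assert(((uintptr_t) p & TAG_MASK) == 0);   /* relies on malloc alignment */
  p->car = car;
  p->cdr = cdr;
  return (uintptr_t) p + PAIR_TAG;
}

/* Walk a list through its cdrs, checking only the tag bits at each step. */
size_t list_length(SCM list) {
  size_t n = 0;
  while (pair_p(list)) {
    n++;
    list = get_pair(list)->cdr;
  }
  return n;
}
```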
279 | ||
280 | ||
281 | @node Guile Is Hairier | |
282 | @subsection Guile Is Hairier | |
283 | ||
284 | We originally started with a very simple typing system --- each object | |
285 | has a field that indicates its type. Then, for the sake of efficiency | |
286 | in both time and space, we moved some of the typing information directly | |
287 | into the @code{SCM} value, and left the rest in the @code{struct value}. | |
288 | Guile itself employs a more complex hierarchy, storing finer and finer | |
289 | gradations of type information in different places, depending on the | |
290 | object's coarser type. | |
291 | ||
292 | In the author's opinion, Guile could be simplified greatly without | |
293 | significant loss of efficiency, but the simplified system would still be | |
294 | more complex than what we've presented above. | |
295 | ||
296 | ||
8680d53b AW |
297 | @node The Libguile Runtime Environment |
298 | @section The Libguile Runtime Environment | |
38a93523 NJ |
299 | |
300 | Here we present the specifics of how Guile represents its data. We | |
301 | don't go into complete detail; an exhaustive description of Guile's | |
302 | system would be boring, and we do not wish to encourage people to write | |
303 | code which depends on its details anyway. We do, however, present | |
8680d53b AW |
304 | everything one need know to use Guile's data. It is assumed that the |
305 | reader understands the concepts laid out in @ref{Data Representation | |
306 | in Scheme}. | |
307 | ||
308 | FIXME: Much of this is outdated as of 1.8; we no longer provide many of | |
309 | these macros. This section also lacks subsections on the evaluator | |
310 | implementation, which is interesting, and notes about tail recursion | |
311 | between Scheme and C. | |
38a93523 NJ |
312 | |
313 | @menu | |
314 | * General Rules:: | |
315 | * Conservative GC:: | |
abaec75d | 316 | * Immediates vs Non-immediates:: |
38a93523 NJ |
317 | * Immediate Datatypes:: |
318 | * Non-immediate Datatypes:: | |
319 | * Signalling Type Errors:: | |
505392ae | 320 | * Unpacking the SCM type:: |
38a93523 NJ |
321 | @end menu |
322 | ||
323 | @node General Rules | |
324 | @subsection General Rules | |
325 | ||
326 | Any code which operates on Guile datatypes must @code{#include} the | |
327 | header file @code{<libguile.h>}. This file contains a definition for | |
328 | the @code{SCM} typedef (Guile's universal type, as in the examples | |
329 | above), and definitions and declarations for a host of macros and | |
330 | functions that operate on @code{SCM} values. | |
331 | ||
332 | All identifiers declared by @code{<libguile.h>} begin with @code{scm_} | |
333 | or @code{SCM_}. | |
334 | ||
335 | @c [[I wish this were true, but I don't think it is at the moment. -JimB]] | |
336 | @c Macros do not evaluate their arguments more than once, unless documented | |
337 | @c to do so. | |
338 | ||
339 | The functions described here generally check the types of their | |
340 | @code{SCM} arguments, and signal an error if their arguments are of an | |
341 | inappropriate type. Macros generally do not, unless that is their | |
342 | specified purpose. You must verify their argument types beforehand, as | |
343 | necessary. | |
344 | ||
345 | Macros and functions that return a boolean value have names ending in | |
346 | @code{P} or @code{_p} (for ``predicate''). Those that return a negated | |
347 | boolean value have names starting with @code{SCM_N}. For example, | |
348 | @code{SCM_IMP (@var{x})} is a predicate which returns non-zero iff | |
349 | @var{x} is an immediate value (an @code{IM}). @code{SCM_NCONSP | |
350 | (@var{x})} is a predicate which returns non-zero iff @var{x} is | |
351 | @emph{not} a pair object (a @code{CONS}). | |
352 | ||
353 | ||
354 | @node Conservative GC | |
355 | @subsection Conservative Garbage Collection | |
356 | ||
357 | Aside from the latent typing, the major source of constraints on a | |
358 | Scheme implementation's data representation is the garbage collector. | |
359 | The collector must be able to traverse every live object in the heap, to | |
360 | determine which objects are not live. | |
361 | ||
362 | There are many ways to implement this, but Guile uses an algorithm | |
363 | called @dfn{mark and sweep}. The collector scans the system's global | |
364 | variables and the local variables on the stack to determine which | |
365 | objects are immediately accessible by the C code. It then scans those | |
366 | objects to find the objects they point to, @i{et cetera}. The collector | |
367 | sets a @dfn{mark bit} on each object it finds, so each object is | |
368 | traversed only once. This process is called @dfn{tracing}. | |
369 | ||
370 | When the collector can find no unmarked objects pointed to by marked | |
371 | objects, it assumes that any objects that are still unmarked will never | |
372 | be used by the program (since there is no path of dereferences from any | |
373 | global or local variable that reaches them) and deallocates them. | |
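
To make the two phases concrete, here is a toy mark-and-sweep collector over a small fixed array. It is only a sketch of the algorithm described above, not Guile's collector: the heap layout, the use of array indices in place of pointers, and the names @code{mark}, @code{sweep} and @code{collect} are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define HEAP_SIZE 8
#define NO_REF (-1)

struct object {
  bool in_use;
  bool marked;
  int refs[2];          /* indices of referenced objects, or NO_REF */
};

struct object heap[HEAP_SIZE];

/* Tracing: mark everything reachable from object i.  The mark bit
   ensures each object is traversed only once, even through cycles. */
void mark(int i) {
  if (i == NO_REF || heap[i].marked)
    return;
  heap[i].marked = true;
  mark(heap[i].refs[0]);
  mark(heap[i].refs[1]);
}

/* Free every object left unmarked and clear the marks for the next
   collection.  Returns the number of objects freed. */
int sweep(void) {
  int freed = 0;
  for (int i = 0; i < HEAP_SIZE; i++) {
    if (heap[i].in_use && !heap[i].marked) {
      heap[i].in_use = false;
      freed++;
    }
    heap[i].marked = false;
  }
  return freed;
}

/* The roots stand in for the global and local variables. */
int collect(const int *roots, size_t nroots) {
  for (size_t r = 0; r < nroots; r++)
    mark(roots[r]);
  return sweep();
}
```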
374 | ||
375 | In the above paragraphs, we did not specify how the garbage collector | |
376 | finds the global and local variables; as usual, there are many different | |
377 | approaches. Frequently, the programmer must maintain a list of pointers | |
378 | to all global variables that refer to the heap, and another list | |
379 | (adjusted upon entry to and exit from each function) of local variables, | |
380 | for the collector's benefit. | |
381 | ||
382 | The list of global variables is usually not too difficult to maintain, | |
383 | since global variables are relatively rare. However, an explicitly | |
384 | maintained list of local variables (in the author's personal experience) | |
385 | is a nightmare to maintain. Thus, Guile uses a technique called | |
386 | @dfn{conservative garbage collection}, to make the local variable list | |
387 | unnecessary. | |
388 | ||
389 | The trick to conservative collection is to treat the stack as an | |
390 | ordinary range of memory, and assume that @emph{every} word on the stack | |
391 | is a pointer into the heap. Thus, the collector marks all objects whose | |
392 | addresses appear anywhere in the stack, without knowing for sure how | |
393 | that word is meant to be interpreted. | |
394 | ||
395 | Obviously, such a system will occasionally retain objects that are | |
396 | actually garbage, and should be freed. In practice, this is not a | |
397 | problem. The alternative, an explicitly maintained list of local | |
398 | variable addresses, is effectively much less reliable, due to programmer | |
399 | error. | |
400 | ||
401 | To accommodate this technique, data must be represented so that the | |
402 | collector can accurately determine whether a given stack word is a | |
403 | pointer or not. Guile does this as follows: | |
38a93523 | 404 | |
505392ae | 405 | @itemize @bullet |
38a93523 NJ |
406 | @item |
407 | Every heap object has a two-word header, called a @dfn{cell}. Some | |
408 | objects, like pairs, fit entirely in a cell's two words; others may | |
409 | store pointers to additional memory in either of the words. For | |
410 | example, strings and vectors store their length in the first word, and a | |
411 | pointer to their elements in the second. | |
412 | ||
413 | @item | |
414 | Guile allocates whole arrays of cells at a time, called @dfn{heap | |
415 | segments}. These segments are always allocated so that the cells they | |
416 | contain fall on eight-byte boundaries, or whatever is appropriate for | |
417 | the machine's word size. Guile keeps all cells in a heap segment | |
418 | initialized, whether or not they are currently in use. | |
419 | ||
420 | @item | |
421 | Guile maintains a sorted table of heap segments. | |
38a93523 NJ |
422 | @end itemize |
423 | ||
424 | Thus, given any random word @var{w} fetched from the stack, Guile's | |
425 | garbage collector can consult the table to see if @var{w} falls within a | |
426 | known heap segment, and check @var{w}'s alignment. If both tests pass, | |
427 | the collector knows that @var{w} is a valid pointer to a cell, | |
428 | intentional or not, and proceeds to trace the cell. | |
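
The two tests can be sketched as a binary search over the sorted segment table followed by an alignment check. The @code{struct segment} layout, the eight-byte @code{CELL_SIZE} and the function name are assumptions for illustration; Guile's actual table is not shown here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define CELL_SIZE 8     /* assumed cell size and alignment, in bytes */

/* One heap segment, covering the half-open range [start, end). */
struct segment { uintptr_t start, end; };

/* Decide whether the stack word w could be the address of a cell.
   The table must be sorted by start address and non-overlapping. */
bool is_cell_address(uintptr_t w, const struct segment *table, size_t n) {
  size_t lo = 0, hi = n;
  while (lo < hi) {
    size_t mid = lo + (hi - lo) / 2;
    if (w < table[mid].start)
      hi = mid;
    else if (w >= table[mid].end)
      lo = mid + 1;
    else
      /* Inside a segment: accept only properly aligned addresses. */
      return (w - table[mid].start) % CELL_SIZE == 0;
  }
  return false;         /* not inside any known segment */
}
```

A word that passes both tests is traced whether or not the program meant it as a pointer, which is why the collector may occasionally retain garbage.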
429 | ||
430 | Note that heap segments do not contain all the data Guile uses; cells | |
431 | for objects like vectors and strings contain pointers to other memory | |
432 | areas. However, since those pointers are internal, and not shared among | |
433 | many pieces of code, it is enough for the collector to find the cell, | |
434 | and then use the cell's type to find more pointers to trace. | |
435 | ||
436 | ||
abaec75d NJ |
437 | @node Immediates vs Non-immediates |
438 | @subsection Immediates vs Non-immediates | |
38a93523 NJ |
439 | |
440 | Guile classifies Scheme objects into two kinds: those that fit entirely | |
441 | within an @code{SCM}, and those that require heap storage. | |
442 | ||
443 | The former class are called @dfn{immediates}. The class of immediates | |
444 | includes small integers, characters, boolean values, the empty list, the | |
445 | mysterious end-of-file object, and some others. | |
446 | ||
85a9b4ed | 447 | The remaining types are called, not surprisingly, @dfn{non-immediates}. |
38a93523 NJ |
448 | They include pairs, procedures, strings, vectors, and all other data |
449 | types in Guile. | |
450 | ||
451 | @deftypefn Macro int SCM_IMP (SCM @var{x}) | |
452 | Return non-zero iff @var{x} is an immediate object. | |
453 | @end deftypefn | |
454 | ||
455 | @deftypefn Macro int SCM_NIMP (SCM @var{x}) | |
456 | Return non-zero iff @var{x} is a non-immediate object. This is the | |
457 | exact complement of @code{SCM_IMP}, above. | |
38a93523 NJ |
458 | @end deftypefn |
459 | ||
ffda6093 | 460 | Note that for versions of Guile prior to 1.4 it was necessary to use the |
abaec75d NJ |
461 | @code{SCM_NIMP} macro before calling a finer-grained predicate to |
462 | determine @var{x}'s type, such as @code{SCM_CONSP} or | |
ffda6093 NJ |
463 | @code{SCM_VECTORP}. This is no longer required: the definitions of all |
464 | Guile type predicates now include a call to @code{SCM_NIMP} where | |
465 | necessary. | |
abaec75d | 466 | |
38a93523 NJ |
467 | |
468 | @node Immediate Datatypes | |
469 | @subsection Immediate Datatypes | |
470 | ||
471 | The following datatypes are immediate values; that is, they fit entirely | |
472 | within an @code{SCM} value. The @code{SCM_IMP} and @code{SCM_NIMP} | |
473 | macros will distinguish these from non-immediates; see @ref{Immediates | |
abaec75d | 474 | vs Non-immediates} for an explanation of the distinction. |
38a93523 NJ |
475 | |
476 | Note that the type predicates for immediate values work correctly on any | |
477 | @code{SCM} value; you do not need to call @code{SCM_IMP} first, to | |
505392ae | 478 | establish that a value is immediate. |
38a93523 NJ |
479 | |
480 | @menu | |
481 | * Integer Data:: | |
482 | * Character Data:: | |
483 | * Boolean Data:: | |
484 | * Unique Values:: | |
485 | @end menu | |
486 | ||
487 | @node Integer Data | |
488 | @subsubsection Integers | |
489 | ||
490 | Here are macros for operating on small integers, those that fit within an | |
491 | @code{SCM}. Such integers are called @dfn{immediate numbers}, or | |
492 | @dfn{INUMs}. In general, INUMs occupy all but two bits of an | |
493 | @code{SCM}. | |
494 | ||
495 | Bignums and floating-point numbers are non-immediate objects, and have | |
496 | their own, separate accessors. The functions here will not work on | |
497 | them. This is not as much of a problem as you might think, however, | |
498 | because the system never constructs bignums that could fit in an INUM, | |
499 | and never uses floating point values for exact integers. | |
500 | ||
501 | @deftypefn Macro int SCM_INUMP (SCM @var{x}) | |
502 | Return non-zero iff @var{x} is a small integer value. | |
503 | @end deftypefn | |
504 | ||
505 | @deftypefn Macro int SCM_NINUMP (SCM @var{x}) | |
506 | The complement of @code{SCM_INUMP}. | |
507 | @end deftypefn | |
508 | ||
509 | @deftypefn Macro int SCM_INUM (SCM @var{x}) | |
510 | Return the value of @var{x} as an ordinary, C integer. If @var{x} | |
511 | is not an INUM, the result is undefined. | |
512 | @end deftypefn | |
513 | ||
514 | @deftypefn Macro SCM SCM_MAKINUM (int @var{i}) | |
515 | Given a C integer @var{i}, return its representation as an @code{SCM}. | |
516 | This macro does not check for overflow. | |
517 | @end deftypefn | |
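
Since two bits of the word are reserved, a C integer must be range-checked before it is packed if overflow matters. The sketch below shows one way such a check could look; it does not use Guile's macros, and @code{FIXNUM_MIN}, @code{FIXNUM_MAX}, @code{make_fixnum_checked} and the tag value @code{1} are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* With two tag bits, the payload is limited to a quarter of the
   range of the underlying word. */
#define FIXNUM_MAX (INTPTR_MAX / 4)
#define FIXNUM_MIN (INTPTR_MIN / 4)
#define FIXNUM_TAG ((intptr_t) 1)   /* hypothetical tag value */

bool fits_fixnum(intptr_t n) {
  return n >= FIXNUM_MIN && n <= FIXNUM_MAX;
}

/* Pack n into *out as 4n + tag, or return false if n is out of range
   and would need a bignum instead. */
bool make_fixnum_checked(intptr_t n, intptr_t *out) {
  if (!fits_fixnum(n))
    return false;
  *out = n * 4 + FIXNUM_TAG;
  return true;
}
```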
518 | ||
519 | ||
520 | @node Character Data | |
521 | @subsubsection Characters | |
522 | ||
523 | Here are macros for operating on characters. | |
524 | ||
525 | @deftypefn Macro int SCM_CHARP (SCM @var{x}) | |
526 | Return non-zero iff @var{x} is a character value. | |
527 | @end deftypefn | |
528 | ||
529 | @deftypefn Macro {unsigned int} SCM_CHAR (SCM @var{x}) | |
530 | Return the value of @var{x} as a C character. If @var{x} is not a | |
531 | Scheme character, the result is undefined. | |
532 | @end deftypefn | |
533 | ||
534 | @deftypefn Macro SCM SCM_MAKE_CHAR (int @var{c}) | |
535 | Given a C character @var{c}, return its representation as a Scheme | |
536 | character value. | |
537 | @end deftypefn | |
538 | ||
539 | ||
540 | @node Boolean Data | |
541 | @subsubsection Booleans | |
542 | ||
3f7e8708 MV |
543 | Booleans are represented as two specific immediate SCM values, |
544 | @code{SCM_BOOL_T} and @code{SCM_BOOL_F}. @xref{Booleans}, for more | |
545 | information. | |
38a93523 NJ |
546 | |
547 | @node Unique Values | |
548 | @subsubsection Unique Values | |
549 | ||
550 | The immediate values that are neither small integers, characters, nor | |
551 | booleans are all unique values --- that is, datatypes with only one | |
552 | instance. | |
553 | ||
554 | @deftypefn Macro SCM SCM_EOL | |
555 | The Scheme empty list object, or ``End Of List'' object, usually written | |
556 | in Scheme as @code{'()}. | |
557 | @end deftypefn | |
558 | ||
559 | @deftypefn Macro SCM SCM_EOF_VAL | |
560 | The Scheme end-of-file value. It has no standard written | |
561 | representation, for obvious reasons. | |
562 | @end deftypefn | |
563 | ||
564 | @deftypefn Macro SCM SCM_UNSPECIFIED | |
565 | The value returned by expressions which the Scheme standard says return | |
566 | an ``unspecified'' value. | |
567 | ||
568 | This is sort of a weirdly literal way to take things, but the standard | |
569 | read-eval-print loop prints nothing when the expression returns this | |
570 | value, so it's not a bad idea to return this when you can't think of | |
571 | anything else helpful. | |
572 | @end deftypefn | |
573 | ||
574 | @deftypefn Macro SCM SCM_UNDEFINED | |
575 | The ``undefined'' value. Its most important property is that it is not | |
576 | equal to any valid Scheme value. This is put to various internal uses | |
577 | by C code interacting with Guile. | |
578 | ||
579 | For example, when you write a C function that is callable from Scheme | |
580 | and which takes optional arguments, the interpreter passes | |
581 | @code{SCM_UNDEFINED} for any optional arguments the caller did not supply. | |
582 | ||
583 | We also use this to mark unbound variables. | |
584 | @end deftypefn | |
585 | ||
586 | @deftypefn Macro int SCM_UNBNDP (SCM @var{x}) | |
587 | Return true if @var{x} is @code{SCM_UNDEFINED}. Apply this to a | |
588 | symbol's value to see if it has a binding as a global variable. | |
589 | @end deftypefn | |
590 | ||
591 | ||
592 | @node Non-immediate Datatypes | |
593 | @subsection Non-immediate Datatypes | |
594 | ||
595 | A non-immediate datatype is one which lives in the heap, either because | |
596 | it cannot fit entirely within an @code{SCM} word, or because it denotes a | |
cee2ed4f | 597 | specific storage location (in the nomenclature of the Revised^5 Report |
38a93523 NJ |
598 | on Scheme). |
599 | ||
600 | The @code{SCM_IMP} and @code{SCM_NIMP} macros will distinguish these | |
abaec75d | 601 | from immediates; see @ref{Immediates vs Non-immediates}. |
38a93523 NJ |
602 | |
603 | Given a cell, Guile distinguishes between pairs and other non-immediate | |
604 | types by storing special @dfn{tag} values in a non-pair cell's car, that | |
605 | cannot appear in normal pairs. A cell with a non-tag value in its car | |
606 | is an ordinary pair. The type of a cell with a tag in its car depends | |
607 | on the tag; the non-immediate type predicates test this value. If a tag | |
608 | value appears elsewhere (in a vector, for example), the heap may become | |
609 | corrupted. | |
610 | ||
505392ae NJ |
611 | Note how the type information for a non-immediate object is split |
612 | between the @code{SCM} word and the cell that the @code{SCM} word points | |
613 | to. The @code{SCM} word itself only indicates that the object is | |
614 | non-immediate --- in other words stored in a heap cell. The tag stored | |
615 | in the first word of the heap cell indicates more precisely the type of | |
616 | that object. | |
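
The split can be illustrated with a toy layout. Everything in this sketch is hypothetical, including the bit patterns and the names @code{nimp}, @code{tag_p}, @code{consp} and @code{vectorp}; it only shows the shape of a predicate that first inspects the @code{SCM} word and then the first word of the cell.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uintptr_t SCM;

struct cell { SCM car, cdr; };

#define LOW_BITS ((uintptr_t) 3)

/* Toy convention: low bits 00 mean "pointer to a cell"; ordinary
   values never have low bits 11, so that pattern is free for tags. */
#define TAG_VECTOR ((SCM) 0x7)
#define TAG_STRING ((SCM) 0xb)

/* Step 1: the SCM word alone says only "immediate" or "non-immediate". */
bool nimp(SCM x) {
  return x != 0 && (x & LOW_BITS) == 0;
}

bool tag_p(SCM word) {
  return (word & LOW_BITS) == LOW_BITS;
}

/* Step 2: the first word of the cell refines the type.  A cell whose
   car is not a tag is an ordinary pair. */
bool consp(SCM x) {
  return nimp(x) && !tag_p(((struct cell *) x)->car);
}

bool vectorp(SCM x) {
  return nimp(x) && ((struct cell *) x)->car == TAG_VECTOR;
}
```

Because each predicate performs the non-immediate check itself, it is safe to apply to any value, including immediates.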
617 | ||
ffda6093 NJ |
618 | The type predicates for non-immediate values work correctly on any |
619 | @code{SCM} value; you do not need to call @code{SCM_NIMP} first, to | |
620 | establish that a value is non-immediate. | |
38a93523 NJ |
621 | |
622 | @menu | |
38a93523 NJ |
623 | * Pair Data:: |
624 | * Vector Data:: | |
625 | * Procedures:: | |
626 | * Closures:: | |
627 | * Subrs:: | |
628 | * Port Data:: | |
629 | @end menu | |
630 | ||
38a93523 NJ |
631 | |
632 | @node Pair Data | |
633 | @subsubsection Pairs | |
634 | ||
635 | Pairs are the essential building block of list structure in Scheme. A | |
636 | pair object has two fields, called the @dfn{car} and the @dfn{cdr}. | |
637 | ||
638 | It is conventional for a pair's @sc{car} to contain an element of a | |
639 | list, and the @sc{cdr} to point to the next pair in the list, or to | |
640 | contain @code{SCM_EOL}, indicating the end of the list. Thus, a set of | |
641 | pairs chained through their @sc{cdr}s constitutes a singly-linked list. | |
642 | Scheme and libguile define many functions which operate on lists | |
643 | constructed in this fashion, so although lists chained through the | |
644 | @sc{car}s of pairs will work fine too, they may be less convenient to | |
645 | manipulate, and receive less support from the community. | |
646 | ||
647 | Guile implements pairs by mapping the @sc{car} and @sc{cdr} of a pair | |
648 | directly into the two words of the cell. | |
649 | ||
650 | ||
651 | @deftypefn Macro int SCM_CONSP (SCM @var{x}) | |
652 | Return non-zero iff @var{x} is a Scheme pair object. | |
38a93523 NJ |
653 | @end deftypefn |
654 | ||
655 | @deftypefn Macro int SCM_NCONSP (SCM @var{x}) | |
656 | The complement of @code{SCM_CONSP}. | |
657 | @end deftypefn | |
658 | ||
38a93523 NJ |
659 | @deftypefun SCM scm_cons (SCM @var{car}, SCM @var{cdr}) |
660 | Allocate (``CONStruct'') a new pair, with @var{car} and @var{cdr} as its | |
661 | contents. | |
662 | @end deftypefun | |
663 | ||
85a9b4ed | 664 | The macros below perform no type checking. The results are undefined if |
38a93523 NJ |
665 | @var{cell} is an immediate. However, since all non-immediate Guile |
666 | objects are constructed from cells, and these macros simply access the | |
667 | two words of a cell, they actually can be useful on datatypes other | |
668 | than pairs. (Of course, it is not very modular to use them outside of | |
669 | the code which implements that datatype.) | |
670 | ||
671 | @deftypefn Macro SCM SCM_CAR (SCM @var{cell}) | |
672 | Return the @sc{car}, or first field, of @var{cell}. | |
673 | @end deftypefn | |
674 | ||
675 | @deftypefn Macro SCM SCM_CDR (SCM @var{cell}) | |
676 | Return the @sc{cdr}, or second field, of @var{cell}. | |
677 | @end deftypefn | |
678 | ||
679 | @deftypefn Macro void SCM_SETCAR (SCM @var{cell}, SCM @var{x}) | |
680 | Set the @sc{car} of @var{cell} to @var{x}. | |
681 | @end deftypefn | |
682 | ||
683 | @deftypefn Macro void SCM_SETCDR (SCM @var{cell}, SCM @var{x}) | |
684 | Set the @sc{cdr} of @var{cell} to @var{x}. | |
685 | @end deftypefn | |
686 | ||
687 | @deftypefn Macro SCM SCM_CAAR (SCM @var{cell}) | |
688 | @deftypefnx Macro SCM SCM_CADR (SCM @var{cell}) | |
689 | @deftypefnx Macro SCM SCM_CDAR (SCM @var{cell}) @dots{} | |
690 | @deftypefnx Macro SCM SCM_CDDDDR (SCM @var{cell}) | |
691 | Return the @sc{car} of the @sc{car} of @var{cell}, the @sc{car} of the | |
692 | @sc{cdr} of @var{cell}, @i{et cetera}. | |
693 | @end deftypefn | |
694 | ||
695 | ||
696 | @node Vector Data | |
697 | @subsubsection Vectors, Strings, and Symbols | |
698 | ||
699 | Vectors, strings, and symbols have some properties in common. They all | |
700 | have a length, and they all have an array of elements. In the case of a | |
701 | vector, the elements are @code{SCM} values; in the case of a string or | |
702 | symbol, the elements are characters. | |
703 | ||
704 | All these types store their length (along with some tagging bits) in the | |
705 | @sc{car} of their header cell, and store a pointer to the elements in | |
706 | their @sc{cdr}. Thus, the @code{SCM_CAR} and @code{SCM_CDR} macros | |
707 | are (somewhat) meaningful when applied to these datatypes. | |
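
As a rough illustration of such a header cell, the sketch below packs a length and a type tag into the first word and keeps the element pointer in the second. The tag width, the tag values and all @code{toy_} names are assumptions made for this example; Guile's real encoding lives in @code{<libguile/tags.h>} and differs in detail.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout, chosen for illustration only: the low 8 bits of
   word 0 hold a type tag and the remaining bits hold the length.  */
#define TOY_TAG_BITS   8
#define TOY_TAG_MASK   ((uintptr_t) 0xff)
#define TOY_TAG_VECTOR ((uintptr_t) 0x0d)
#define TOY_TAG_STRING ((uintptr_t) 0x15)

typedef struct toy_header
{
  uintptr_t word_0;             /* "car": length plus tag bits */
  uintptr_t word_1;             /* "cdr": pointer to the elements */
} toy_header;

/* Build a header for an object of type TAG with LENGTH elements.  */
static toy_header
toy_make_header (uintptr_t tag, size_t length, void *elements)
{
  toy_header h;
  h.word_0 = ((uintptr_t) length << TOY_TAG_BITS) | tag;
  h.word_1 = (uintptr_t) elements;
  return h;
}

/* Non-zero iff H describes a string in this toy encoding.  */
static int
toy_stringp (const toy_header *h)
{
  return (h->word_0 & TOY_TAG_MASK) == TOY_TAG_STRING;
}

/* The length is whatever remains after shifting the tag bits away.  */
static size_t
toy_object_length (const toy_header *h)
{
  return (size_t) (h->word_0 >> TOY_TAG_BITS);
}

/* Return the character array; only meaningful for strings.  */
static char *
toy_string_chars (const toy_header *h)
{
  assert (toy_stringp (h));
  return (char *) h->word_1;
}
```

Because the length shares a word with the tag, reading that word raw (as @code{SCM_CAR} would) yields a value that needs masking or shifting before it means anything, which is why the macros are only somewhat meaningful on these types.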
708 | ||
709 | @deftypefn Macro int SCM_VECTORP (SCM @var{x}) | |
710 | Return non-zero iff @var{x} is a vector. | |
38a93523 NJ |
711 | @end deftypefn |
712 | ||
713 | @deftypefn Macro int SCM_STRINGP (SCM @var{x}) | |
714 | Return non-zero iff @var{x} is a string. | |
38a93523 NJ |
715 | @end deftypefn |
716 | ||
717 | @deftypefn Macro int SCM_SYMBOLP (SCM @var{x}) | |
718 | Return non-zero iff @var{x} is a symbol. | |
38a93523 NJ |
719 | @end deftypefn |
720 | ||
cee2ed4f MG |
721 | @deftypefn Macro int SCM_VECTOR_LENGTH (SCM @var{x}) |
722 | @deftypefnx Macro int SCM_STRING_LENGTH (SCM @var{x}) | |
723 | @deftypefnx Macro int SCM_SYMBOL_LENGTH (SCM @var{x}) | |
724 | Return the length of the object @var{x}. The result is undefined if | |
725 | @var{x} is not a vector, string, or symbol, respectively. | |
38a93523 NJ |
726 | @end deftypefn |
727 | ||
cee2ed4f | 728 | @deftypefn Macro {SCM *} SCM_VECTOR_BASE (SCM @var{x}) |
38a93523 | 729 | Return a pointer to the array of elements of the vector @var{x}. |
505392ae | 730 | The result is undefined if @var{x} is not a vector. |
38a93523 NJ |
731 | @end deftypefn |
732 | ||
cee2ed4f MG |
733 | @deftypefn Macro {char *} SCM_STRING_CHARS (SCM @var{x}) |
734 | @deftypefnx Macro {char *} SCM_SYMBOL_CHARS (SCM @var{x}) | |
735 | Return a pointer to the characters of @var{x}. The result is undefined | |
736 | if @var{x} is not a symbol or string, respectively. | |
38a93523 NJ |
737 | @end deftypefn |
738 | ||
739 | There are also a few magic values stuffed into memory before a symbol's | |
740 | characters, but you don't want to know about those. What cruft! | |
741 | ||
cf4e2dab KR |
742 | Note that @code{SCM_VECTOR_BASE}, @code{SCM_STRING_CHARS} and |
743 | @code{SCM_SYMBOL_CHARS} return pointers to data within the respective | |
744 | object. Care must be taken that the object is not garbage collected | |
745 | while that data is still being accessed. This is the same as for a | |
746 | smob; see @ref{Remembering During Operations}. | |
747 | ||
38a93523 NJ |
748 | |
749 | @node Procedures | |
750 | @subsubsection Procedures | |
751 | ||
752 | Guile provides two kinds of procedures: @dfn{closures}, which are the | |
753 | result of evaluating a @code{lambda} expression, and @dfn{subrs}, which | |
754 | are C functions packaged up as Scheme objects, to make them available to | |
755 | Scheme programmers. | |
756 | ||
757 | (There are actually other sorts of procedures: compiled closures, and | |
758 | continuations; see the source code for details about them.) | |
759 | ||
760 | @deftypefun SCM scm_procedure_p (SCM @var{x}) | |
761 | Return @code{SCM_BOOL_T} iff @var{x} is a Scheme procedure object, of | |
762 | any sort. Otherwise, return @code{SCM_BOOL_F}. | |
763 | @end deftypefun | |
764 | ||
765 | ||
766 | @node Closures | |
767 | @subsubsection Closures | |
768 | ||
769 | [FIXME: this needs to be further subbed, but texinfo has no subsubsub] | |
770 | ||
771 | A closure is a procedure object, generated as the value of a | |
772 | @code{lambda} expression in Scheme. The representation of a closure is | |
773 | straightforward --- it contains a pointer to the code of the lambda | |
774 | expression from which it was created, and a pointer to the environment | |
775 | it closes over. | |
776 | ||
777 | In Guile, each closure also has a property list, allowing the system to | |
778 | store information about the closure. I'm not sure what this is used for | |
779 | at the moment --- the debugger, maybe? | |
780 | ||
781 | @deftypefn Macro int SCM_CLOSUREP (SCM @var{x}) | |
505392ae | 782 | Return non-zero iff @var{x} is a closure. |
38a93523 NJ |
783 | @end deftypefn |
784 | ||
785 | @deftypefn Macro SCM SCM_PROCPROPS (SCM @var{x}) | |
786 | Return the property list of the closure @var{x}. The results are | |
787 | undefined if @var{x} is not a closure. | |
788 | @end deftypefn | |
789 | ||
790 | @deftypefn Macro void SCM_SETPROCPROPS (SCM @var{x}, SCM @var{p}) | |
791 | Set the property list of the closure @var{x} to @var{p}. The results | |
792 | are undefined if @var{x} is not a closure. | |
793 | @end deftypefn | |
794 | ||
795 | @deftypefn Macro SCM SCM_CODE (SCM @var{x}) | |
505392ae | 796 | Return the code of the closure @var{x}. The result is undefined if |
38a93523 NJ |
797 | @var{x} is not a closure. |
798 | ||
799 | This function should probably only be used internally by the | |
800 | interpreter, since the representation of the code is intimately | |
801 | connected with the interpreter's implementation. | |
802 | @end deftypefn | |
803 | ||
804 | @deftypefn Macro SCM SCM_ENV (SCM @var{x}) | |
805 | Return the environment enclosed by @var{x}. | |
505392ae | 806 | The result is undefined if @var{x} is not a closure. |
38a93523 NJ |
807 | |
808 | This function should probably only be used internally by the | |
809 | interpreter, since the representation of the environment is intimately | |
810 | connected with the interpreter's implementation. | |
811 | @end deftypefn | |
812 | ||
813 | ||
814 | @node Subrs | |
815 | @subsubsection Subrs | |
816 | ||
817 | [FIXME: this needs to be further subbed, but texinfo has no subsubsub] | |
818 | ||
819 | A subr is a pointer to a C function, packaged up as a Scheme object to | |
820 | make it callable by Scheme code. In addition to the function pointer, | |
821 | the subr also contains a pointer to the name of the function, and | |
85a9b4ed | 822 | information about the number of arguments accepted by the C function, for |
38a93523 NJ |
823 | the sake of error checking. |
824 | ||
825 | There is no single type predicate macro that recognizes subrs, as | |
826 | distinct from other kinds of procedures. The closest thing is | |
827 | @code{scm_procedure_p}; see @ref{Procedures}. | |
828 | ||
829 | @deftypefn Macro {char *} SCM_SNAME (@var{x}) | |
505392ae | 830 | Return the name of the subr @var{x}. The result is undefined if |
38a93523 NJ |
831 | @var{x} is not a subr. |
832 | @end deftypefn | |
833 | ||
bcf009c3 | 834 | @deftypefun SCM scm_c_define_gsubr (char *@var{name}, int @var{req}, int @var{opt}, int @var{rest}, SCM (*@var{function})()) |
38a93523 NJ |
835 | Create a new subr object named @var{name}, based on the C function |
836 | @var{function}, make it visible to Scheme as the value of a global | |
837 | variable named @var{name}, and return the subr object. | |
838 | ||
839 | The subr object accepts @var{req} required arguments, @var{opt} optional | |
840 | arguments, and a @var{rest} argument iff @var{rest} is non-zero. The C | |
841 | function @var{function} should accept @code{@var{req} + @var{opt}} | |
842 | arguments, or @code{@var{req} + @var{opt} + 1} arguments if @var{rest} | |
843 | is non-zero. | |
844 | ||
845 | When a subr object is applied, it must be applied to at least @var{req} | |
846 | arguments, or else Guile signals an error. @var{function} receives the | |
847 | subr's first @var{req} arguments as its first @var{req} arguments. If | |
848 | there are fewer than @var{opt} arguments remaining, then @var{function} | |
849 | receives the value @code{SCM_UNDEFINED} for any missing optional | |
01adf598 KR |
850 | arguments. |
851 | ||
852 | If @var{rest} is non-zero, then any arguments after the first | |
853 | @code{@var{req} + @var{opt}} are packaged up as a list and passed as | |
854 | @var{function}'s last argument. @var{function} must not modify that | |
855 | list. (When the subr is called through @code{apply}, the list comes | |
856 | directly from the @code{apply} argument, which the caller will expect | |
857 | to be unchanged.) | |
38a93523 NJ |
858 | |
859 | Note that subrs can actually only accept a predefined set of | |
860 | combinations of required, optional, and rest arguments. For example, a | |
861 | subr can take one required argument, or one required and one optional | |
862 | argument, but a subr can't take one required and two optional arguments. | |
863 | It's bizarre, but that's the way the interpreter was written. If the | |
bcf009c3 NJ |
864 | arguments to @code{scm_c_define_gsubr} do not fit one of the predefined |
865 | patterns, then @code{scm_c_define_gsubr} will return a compiled closure | |
38a93523 NJ |
866 | object instead of a subr object. |
867 | @end deftypefun | |
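
The way arguments are distributed over required, optional and rest positions can be modelled with a small stand-alone function. This is a sketch under stated assumptions, not the interpreter's code: arguments are plain @code{int} values, @code{TOY_UNDEFINED} is a hypothetical stand-in for @code{SCM_UNDEFINED}, and treating surplus arguments without a rest parameter as an error is an assumption of the sketch.

```c
#include <assert.h>

/* Hypothetical sentinel standing in for SCM_UNDEFINED.  */
#define TOY_UNDEFINED (-1)

/* Distribute ARGS[0..NARGS-1] over REQ required and OPT optional
   slots.  SLOTS must have room for REQ + OPT entries.  Return the
   number of leftover arguments that would be packaged up as the rest
   list, or -1 if the call is an arity error.  */
static int
toy_bind_args (int req, int opt, int rest,
               const int *args, int nargs, int *slots)
{
  int i;

  assert (req >= 0 && opt >= 0 && nargs >= 0);

  if (nargs < req)
    return -1;                  /* too few arguments */
  if (!rest && nargs > req + opt)
    return -1;                  /* too many, and no rest argument */

  /* Supplied arguments fill the slots in order; any optional slot
     left over receives the "undefined" marker.  */
  for (i = 0; i < req + opt; i++)
    slots[i] = (i < nargs) ? args[i] : TOY_UNDEFINED;

  return (nargs > req + opt) ? nargs - (req + opt) : 0;
}
```

With one required and two optional slots, a call that supplies two arguments leaves the last slot holding @code{TOY_UNDEFINED}; a call that supplies five arguments fills all three slots and reports two leftover arguments for the rest list.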
868 | ||
869 | ||
870 | @node Port Data | |
871 | @subsubsection Ports | |
872 | ||
873 | Haven't written this yet, 'cos I don't understand ports yet. | |
874 | ||
875 | ||
876 | @node Signalling Type Errors | |
877 | @subsection Signalling Type Errors | |
878 | ||
879 | Every function visible at the Scheme level should aggressively check the | |
880 | types of its arguments, to avoid misinterpreting a value, and perhaps | |
881 | causing a segmentation fault. Guile provides some macros to make this | |
882 | easier. | |
883 | ||
813c57db NJ |
884 | @deftypefn Macro void SCM_ASSERT (int @var{test}, SCM @var{obj}, unsigned int @var{position}, const char *@var{subr}) |
885 | If @var{test} is zero, signal a ``wrong type argument'' error, | |
886 | attributed to the subroutine named @var{subr}, operating on the value | |
887 | @var{obj}, which is the @var{position}'th argument of @var{subr}. | |
38a93523 NJ |
888 | @end deftypefn |
889 | ||
890 | @deftypefn Macro int SCM_ARG1 | |
891 | @deftypefnx Macro int SCM_ARG2 | |
892 | @deftypefnx Macro int SCM_ARG3 | |
893 | @deftypefnx Macro int SCM_ARG4 | |
894 | @deftypefnx Macro int SCM_ARG5 | |
813c57db NJ |
895 | @deftypefnx Macro int SCM_ARG6 |
896 | @deftypefnx Macro int SCM_ARG7 | |
897 | One of the above values can be used for @var{position} to indicate the | |
898 | number of the argument of @var{subr} which is being checked. | |
899 | Alternatively, a positive integer can be used, which makes it possible to | |
900 | check arguments after the seventh. However, for parameter numbers up to | |
901 | seven it is preferable to use the matching constant from @code{SCM_ARG1} | |
902 | to @code{SCM_ARG7} instead of the raw number, since it will make the code easier to | |
903 | understand. | |
38a93523 NJ |
904 | @end deftypefn |
905 | ||
906 | @deftypefn Macro int SCM_ARGn | |
813c57db NJ |
907 | Passing a value of zero or @code{SCM_ARGn} for @var{position} leaves it |
908 | unspecified which argument's type is incorrect. Again, | |
909 | @code{SCM_ARGn} should be preferred over a raw zero constant. | |
38a93523 NJ |
910 | @end deftypefn |
911 | ||
912 | ||
505392ae NJ |
913 | @node Unpacking the SCM type |
914 | @subsection Unpacking the SCM Type | |
915 | ||
916 | The previous sections have explained how @code{SCM} values can refer to | |
917 | immediate and non-immediate Scheme objects. For immediate objects, the | |
918 | complete object value is stored in the @code{SCM} word itself, while for | |
919 | non-immediates, the @code{SCM} word contains a pointer to a heap cell, | |
920 | and further information about the object in question is stored in that | |
921 | cell. This section describes how the @code{SCM} type is actually | |
922 | represented and used at the C level. | |
923 | ||
3229f68b MV |
924 | In fact, there are two basic C data types to represent objects in |
925 | Guile: @code{SCM} and @code{scm_t_bits}. | |
505392ae NJ |
926 | |
927 | @menu | |
9d5315b6 | 928 | * Relationship between SCM and scm_t_bits:: |
505392ae NJ |
929 | * Immediate objects:: |
930 | * Non-immediate objects:: | |
9d5315b6 | 931 | * Allocating Cells:: |
505392ae NJ |
932 | * Heap Cell Type Information:: |
933 | * Accessing Cell Entries:: | |
934 | * Basic Rules for Accessing Cell Entries:: | |
935 | @end menu | |
936 | ||
937 | ||
9d5315b6 MV |
938 | @node Relationship between SCM and scm_t_bits |
939 | @subsubsection Relationship between @code{SCM} and @code{scm_t_bits} | |
505392ae NJ |
940 | |
941 | A variable of type @code{SCM} is guaranteed to hold a valid Scheme | |
9d5315b6 | 942 | object. A variable of type @code{scm_t_bits}, on the other hand, may |
505392ae NJ |
943 | hold a representation of a @code{SCM} value as a C integral type, but |
944 | may also hold any C value, even if it does not correspond to a valid | |
945 | Scheme object. | |
946 | ||
947 | For a variable @var{x} of type @code{SCM}, the Scheme object's type | |
948 | information is stored in a form that is not directly usable. To be able | |
949 | to work on the type encoding of the Scheme value, the @code{SCM} | |
950 | variable has to be transformed into the corresponding representation as | |
9d5315b6 | 951 | a @code{scm_t_bits} variable @var{y} by using the @code{SCM_UNPACK} |
505392ae | 952 | macro. Once this has been done, the type of the Scheme object @var{x} |
9d5315b6 | 953 | can be derived from the content of the bits of the @code{scm_t_bits} |
505392ae NJ |
954 | value @var{y}, in the way illustrated by the example earlier in this |
955 | chapter (@pxref{Cheaper Pairs}). Conversely, a valid bit encoding of a | |
9d5315b6 | 956 | Scheme value as a @code{scm_t_bits} variable can be transformed into the |
505392ae NJ |
957 | corresponding @code{SCM} value using the @code{SCM_PACK} macro. |
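
The split between an opaque object type and an integer type that exposes the bits can be sketched as follows. Everything here is hypothetical: @code{toy_scm} and @code{toy_bits} only stand in for @code{SCM} and @code{scm_t_bits}, and the two-bit integer tag is an assumed encoding in the spirit of the earlier examples, not the one Guile uses.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for scm_t_bits and SCM.  Wrapping the integer
   in a struct keeps the compiler from treating it as a plain number.  */
typedef uintptr_t toy_bits;
typedef struct { toy_bits n; } toy_scm;

/* Expose the bits of X so that tag tests can be applied to them.  */
static toy_bits
toy_unpack (toy_scm x)
{
  return x.n;
}

/* Turn a valid bit encoding back into an opaque object value.  */
static toy_scm
toy_pack (toy_bits b)
{
  toy_scm x;
  x.n = b;
  return x;
}

/* Assumed encoding for this sketch: small non-negative integers are
   immediates whose two low bits are 10.  */
#define TOY_INT_TAG  ((toy_bits) 2)
#define TOY_TAG_MASK ((toy_bits) 3)

static toy_scm
toy_make_int (toy_bits i)
{
  return toy_pack ((i << 2) | TOY_INT_TAG);
}

/* The type test works on the unpacked bits, never on toy_scm itself.  */
static int
toy_intp (toy_scm x)
{
  return (toy_unpack (x) & TOY_TAG_MASK) == TOY_INT_TAG;
}

static toy_bits
toy_int_value (toy_scm x)
{
  assert (toy_intp (x));
  return toy_unpack (x) >> 2;
}
```

Since @code{toy_scm} is a struct, an expression such as @code{x & 3} does not compile; the programmer has to go through @code{toy_unpack} first, which mirrors the discipline the two real types are meant to enforce.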
958 | ||
505392ae NJ |
959 | @node Immediate objects |
960 | @subsubsection Immediate objects | |
961 | ||
962 | A Scheme object may either be an immediate, i.e. carrying all necessary | |
963 | information by itself, or it may contain a reference to a @dfn{cell} | |
964 | with additional information on the heap. Although in general it should | |
965 | be irrelevant for user code whether an object is an immediate or not, | |
966 | within Guile's own code the distinction is sometimes of importance. | |
967 | Thus, the following low level macro is provided: | |
968 | ||
969 | @deftypefn Macro int SCM_IMP (SCM @var{x}) | |
970 | A Scheme object is an immediate if it fulfills the @code{SCM_IMP} | |
971 | predicate, otherwise it holds an encoded reference to a heap cell. The | |
972 | result of the predicate is delivered as a C style boolean value. User | |
973 | code and code that extends Guile should normally not be required to use | |
974 | this macro. | |
975 | @end deftypefn | |
976 | ||
977 | @noindent | |
978 | Summary: | |
979 | @itemize @bullet | |
980 | @item | |
981 | Given a Scheme object @var{x} of unknown type, check first | |
982 | with @code{SCM_IMP (@var{x})} if it is an immediate object. | |
983 | @item | |
984 | If so, all of the type and value information can be determined from the | |
9d5315b6 | 985 | @code{scm_t_bits} value that is delivered by @code{SCM_UNPACK |
505392ae NJ |
986 | (@var{x})}. |
987 | @end itemize | |
988 | ||
989 | ||
990 | @node Non-immediate objects | |
991 | @subsubsection Non-immediate objects | |
992 | ||
85a9b4ed | 993 | A Scheme object of type @code{SCM} that does not fulfill the |
505392ae NJ |
994 | @code{SCM_IMP} predicate holds an encoded reference to a heap cell. |
995 | This reference can be decoded to a C pointer to a heap cell using the | |
996 | @code{SCM2PTR} macro. The encoding of a pointer to a heap cell into a | |
997 | @code{SCM} value is done using the @code{PTR2SCM} macro. | |
998 | ||
999 | @c (FIXME:: this name should be changed) | |
eff313ed | 1000 | @deftypefn Macro {scm_t_cell *} SCM2PTR (SCM @var{x}) |
505392ae NJ |
1001 | Extract and return the heap cell pointer from a non-immediate @code{SCM} |
1002 | object @var{x}. | |
1003 | @end deftypefn | |
1004 | ||
1005 | @c (FIXME:: this name should be changed) | |
228a24ef | 1006 | @deftypefn Macro SCM PTR2SCM (scm_t_cell * @var{x}) |
505392ae NJ |
1007 | Return a @code{SCM} value that encodes a reference to the heap cell |
1008 | pointer @var{x}. | |
1009 | @end deftypefn | |
1010 | ||
1011 | Note that it is also possible to transform a non-immediate @code{SCM} | |
9d5315b6 | 1012 | value by using @code{SCM_UNPACK} into a @code{scm_t_bits} variable. |
505392ae | 1013 | However, the result of @code{SCM_UNPACK} may not be used as a pointer to |
228a24ef | 1014 | a @code{scm_t_cell}: only @code{SCM2PTR} is guaranteed to transform a |
505392ae NJ |
1015 | @code{SCM} object into a valid pointer to a heap cell. Also, it is not |
1016 | allowed to apply @code{PTR2SCM} to anything that is not a valid pointer | |
1017 | to a heap cell. | |
1018 | ||
1019 | @noindent | |
1020 | Summary: | |
1021 | @itemize @bullet | |
1022 | @item | |
1023 | Only use @code{SCM2PTR} on @code{SCM} values for which @code{SCM_IMP} is | |
1024 | false! | |
1025 | @item | |
228a24ef | 1026 | Don't use @code{(scm_t_cell *) SCM_UNPACK (@var{x})}! Use @code{SCM2PTR |
505392ae NJ |
1027 | (@var{x})} instead! |
1028 | @item | |
1029 | Don't use @code{PTR2SCM} for anything but a cell pointer! | |
1030 | @end itemize | |
1031 | ||
9d5315b6 MV |
1032 | @node Allocating Cells |
1033 | @subsubsection Allocating Cells | |
1034 | ||
1035 | Guile provides both ordinary cells with two slots, and double cells | |
1036 | with four slots. The following two functions are the most primitive | |
1037 | way to allocate such cells. | |
1038 | ||
1039 | If the caller intends to use a cell as a header for some other type, she | |
1040 | must pass an appropriate magic value in @var{word_0}, to mark it as a | |
1041 | member of that type, and pass whatever values that type expects in | |
1042 | @var{word_1} and any later words. You should generally not need these functions, | |
1043 | unless you are implementing a new datatype, and thoroughly understand | |
1044 | the code in @code{<libguile/tags.h>}. | |
1045 | ||
1046 | If you just want to allocate pairs, use @code{scm_cons}. | |
1047 | ||
228a24ef | 1048 | @deftypefn Function SCM scm_cell (scm_t_bits word_0, scm_t_bits word_1) |
9d5315b6 MV |
1049 | Allocate a new cell, initialize the two slots with @var{word_0} and |
1050 | @var{word_1}, and return it. | |
1051 | ||
1052 | Note that @var{word_0} and @var{word_1} are of type @code{scm_t_bits}. | |
1053 | If you want to pass a @code{SCM} object, you need to use | |
1054 | @code{SCM_UNPACK}. | |
1055 | @end deftypefn | |
1056 | ||
228a24ef DH |
1057 | @deftypefn Function SCM scm_double_cell (scm_t_bits word_0, scm_t_bits word_1, scm_t_bits word_2, scm_t_bits word_3) |
1058 | Like @code{scm_cell}, but allocates a double cell with four | |
9d5315b6 MV |
1059 | slots. |
1060 | @end deftypefn | |
505392ae NJ |
1061 | |
1062 | @node Heap Cell Type Information | |
1063 | @subsubsection Heap Cell Type Information | |
1064 | ||
1065 | Heap cells contain a number of entries, each of which is either a Scheme | |
9d5315b6 | 1066 | object of type @code{SCM} or a raw C value of type @code{scm_t_bits}. |
505392ae NJ |
1067 | Which of the cell entries contain Scheme objects and which contain raw C |
1068 | values is determined by the first entry of the cell, which holds the | |
1069 | cell type information. | |
1070 | ||
9d5315b6 | 1071 | @deftypefn Macro scm_t_bits SCM_CELL_TYPE (SCM @var{x}) |
505392ae NJ |
1072 | For a non-immediate Scheme object @var{x}, deliver the content of the |
1073 | first entry of the heap cell referenced by @var{x}. This value holds | |
1074 | the information about the cell type. | |
1075 | @end deftypefn | |
1076 | ||
9d5315b6 | 1077 | @deftypefn Macro void SCM_SET_CELL_TYPE (SCM @var{x}, scm_t_bits @var{t}) |
505392ae NJ |
1078 | For a non-immediate Scheme object @var{x}, write the value @var{t} into |
1079 | the first entry of the heap cell referenced by @var{x}. The value | |
1080 | @var{t} must hold a valid cell type. | |
1081 | @end deftypefn | |
1082 | ||
1083 | ||
1084 | @node Accessing Cell Entries | |
1085 | @subsubsection Accessing Cell Entries | |
1086 | ||
1087 | For a non-immediate Scheme object @var{x}, the object type can be | |
1088 | determined by reading the cell type entry using the @code{SCM_CELL_TYPE} | |
1089 | macro. For each different type of cell it is known which cell entries | |
1090 | hold Scheme objects and which cell entries hold raw C data. To access | |
1091 | the different cell entries appropriately, the following macros are | |
1092 | provided. | |
1093 | ||
9d5315b6 | 1094 | @deftypefn Macro scm_t_bits SCM_CELL_WORD (SCM @var{x}, unsigned int @var{n}) |
505392ae NJ |
1095 | Deliver the cell entry @var{n} of the heap cell referenced by the |
1096 | non-immediate Scheme object @var{x} as raw data. It is illegal to | |
1097 | access cell entries that hold Scheme objects by using these macros. For | |
1098 | convenience, the following macros are also provided. | |
230712c9 | 1099 | @itemize @bullet |
505392ae NJ |
1100 | @item |
1101 | SCM_CELL_WORD_0 (@var{x}) @result{} SCM_CELL_WORD (@var{x}, 0) | |
1102 | @item | |
1103 | SCM_CELL_WORD_1 (@var{x}) @result{} SCM_CELL_WORD (@var{x}, 1) | |
1104 | @item | |
1105 | @dots{} | |
1106 | @item | |
1107 | SCM_CELL_WORD_@var{n} (@var{x}) @result{} SCM_CELL_WORD (@var{x}, @var{n}) | |
1108 | @end itemize | |
1109 | @end deftypefn | |
1110 | ||
1111 | @deftypefn Macro SCM SCM_CELL_OBJECT (SCM @var{x}, unsigned int @var{n}) | |
1112 | Deliver the cell entry @var{n} of the heap cell referenced by the | |
1113 | non-immediate Scheme object @var{x} as a Scheme object. It is illegal | |
1114 | to access cell entries that do not hold Scheme objects by using these | |
1115 | macros. For convenience, the following macros are also provided. | |
230712c9 | 1116 | @itemize @bullet |
505392ae NJ |
1117 | @item |
1118 | SCM_CELL_OBJECT_0 (@var{x}) @result{} SCM_CELL_OBJECT (@var{x}, 0) | |
1119 | @item | |
1120 | SCM_CELL_OBJECT_1 (@var{x}) @result{} SCM_CELL_OBJECT (@var{x}, 1) | |
1121 | @item | |
1122 | @dots{} | |
1123 | @item | |
1124 | SCM_CELL_OBJECT_@var{n} (@var{x}) @result{} SCM_CELL_OBJECT (@var{x}, | |
1125 | @var{n}) | |
1126 | @end itemize | |
1127 | @end deftypefn | |
1128 | ||
9d5315b6 | 1129 | @deftypefn Macro void SCM_SET_CELL_WORD (SCM @var{x}, unsigned int @var{n}, scm_t_bits @var{w}) |
505392ae NJ |
1130 | Write the raw C value @var{w} into entry number @var{n} of the heap cell |
1131 | referenced by the non-immediate Scheme value @var{x}. Values that are | |
1132 | written into cells this way may only be read from the cells using the | |
1133 | @code{SCM_CELL_WORD} macros or, in case cell entry 0 is written, using | |
1134 | the @code{SCM_CELL_TYPE} macro. For the special case of cell entry 0, | |
1135 | @var{w} must contain cell type information that | |
1136 | does not describe a Scheme object. For convenience, the following | |
1137 | macros are also provided. | |
230712c9 | 1138 | @itemize @bullet |
505392ae NJ |
1139 | @item |
1140 | SCM_SET_CELL_WORD_0 (@var{x}, @var{w}) @result{} SCM_SET_CELL_WORD | |
1141 | (@var{x}, 0, @var{w}) | |
1142 | @item | |
1143 | SCM_SET_CELL_WORD_1 (@var{x}, @var{w}) @result{} SCM_SET_CELL_WORD | |
1144 | (@var{x}, 1, @var{w}) | |
1145 | @item | |
1146 | @dots{} | |
1147 | @item | |
1148 | SCM_SET_CELL_WORD_@var{n} (@var{x}, @var{w}) @result{} SCM_SET_CELL_WORD | |
1149 | (@var{x}, @var{n}, @var{w}) | |
1150 | @end itemize | |
1151 | @end deftypefn | |
1152 | ||
1153 | @deftypefn Macro void SCM_SET_CELL_OBJECT (SCM @var{x}, unsigned int @var{n}, SCM @var{o}) | |
1154 | Write the Scheme object @var{o} into entry number @var{n} of the heap | |
1155 | cell referenced by the non-immediate Scheme value @var{x}. Values that | |
1156 | are written into cells this way may only be read from the cells using | |
1157 | the @code{SCM_CELL_OBJECT} macros or, in case cell entry 0 is written, | |
1158 | using the @code{SCM_CELL_TYPE} macro. For the special case of cell | |
1159 | entry 0 the writing of a Scheme object into this cell is only allowed | |
1160 | if the cell forms a Scheme pair. For convenience, the following macros | |
1161 | are also provided. | |
230712c9 | 1162 | @itemize @bullet |
505392ae NJ |
1163 | @item |
1164 | SCM_SET_CELL_OBJECT_0 (@var{x}, @var{o}) @result{} SCM_SET_CELL_OBJECT | |
1165 | (@var{x}, 0, @var{o}) | |
1166 | @item | |
1167 | SCM_SET_CELL_OBJECT_1 (@var{x}, @var{o}) @result{} SCM_SET_CELL_OBJECT | |
1168 | (@var{x}, 1, @var{o}) | |
1169 | @item | |
1170 | @dots{} | |
1171 | @item | |
1172 | SCM_SET_CELL_OBJECT_@var{n} (@var{x}, @var{o}) @result{} | |
1173 | SCM_SET_CELL_OBJECT (@var{x}, @var{n}, @var{o}) | |
1174 | @end itemize | |
1175 | @end deftypefn | |
1176 | ||
1177 | @noindent | |
1178 | Summary: | |
1179 | @itemize @bullet | |
1180 | @item | |
1181 | For a non-immediate Scheme object @var{x} of unknown type, get the type | |
1182 | information by using @code{SCM_CELL_TYPE (@var{x})}. | |
1183 | @item | |
1184 | As soon as the cell type information is available, only use the | |
1185 | appropriate access methods to read and write data to the different cell | |
1186 | entries. | |
1187 | @end itemize | |
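
The summary above amounts to "read the type word first, then choose the accessor that matches it". A minimal sketch of that discipline follows. It is a toy model: the two cell types, their tag values and every @code{toy_} name are invented for the example, and a real object reference is not necessarily a bare pointer as it is here.

```c
#include <assert.h>
#include <stdint.h>

typedef uintptr_t toy_bits;                    /* stand-in for scm_t_bits */
typedef struct { toy_bits n; } toy_scm;        /* stand-in for SCM */
typedef struct { toy_bits word[2]; } toy_cell; /* stand-in for a heap cell */

/* Hypothetical cell types; entry 0 of every toy cell holds one.  */
#define TOY_TYPE_MASK ((toy_bits) 0x7f)
#define TOY_TYPE_BOX  ((toy_bits) 0x17)  /* entry 1 holds an object */
#define TOY_TYPE_BLOB ((toy_bits) 0x1f)  /* entry 1 holds a raw C value */

static toy_cell *
toy_cell_ptr (toy_scm x)
{
  return (toy_cell *) x.n;
}

static toy_scm
toy_from_cell (toy_cell *c)
{
  toy_scm x;
  x.n = (toy_bits) c;
  return x;
}

static toy_bits
toy_cell_type (toy_scm x)
{
  return toy_cell_ptr (x)->word[0];
}

/* Raw-data accessor, in the role of SCM_CELL_WORD.  */
static toy_bits
toy_cell_word (toy_scm x, unsigned int n)
{
  assert (n < 2);
  return toy_cell_ptr (x)->word[n];
}

/* Object accessor, in the role of SCM_CELL_OBJECT.  */
static toy_scm
toy_cell_object (toy_scm x, unsigned int n)
{
  toy_scm o;
  assert (n < 2);
  o.n = toy_cell_ptr (x)->word[n];
  return o;
}

/* Inspect the type, then read entry 1 with the matching accessor.
   Return 1 and set *OBJ for a box, 0 and set *RAW for a blob, and -1
   for a type this sketch does not know.  */
static int
toy_read_payload (toy_scm x, toy_scm *obj, toy_bits *raw)
{
  switch (toy_cell_type (x) & TOY_TYPE_MASK)
    {
    case TOY_TYPE_BOX:
      *obj = toy_cell_object (x, 1);
      return 1;
    case TOY_TYPE_BLOB:
      *raw = toy_cell_word (x, 1);
      return 0;
    default:
      return -1;
    }
}
```

Keeping the two accessors separate, even though they read the same memory, lets the type system record whether a caller believes an entry holds an object or raw data.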
1188 | ||
1189 | ||
1190 | @node Basic Rules for Accessing Cell Entries | |
1191 | @subsubsection Basic Rules for Accessing Cell Entries | |
1192 | ||
1193 | For each cell type it is generally up to the implementation of that type | |
1194 | which of the corresponding cell entries hold Scheme objects and which | |
1195 | hold raw C values. However, there is one basic rule that has to be | |
1196 | followed: Scheme pairs consist of exactly two cell entries, which both | |
1197 | contain Scheme objects. Further, a cell which contains a Scheme object | |
1198 | in its first entry has to be a Scheme pair. In other words, it is not | |
1199 | allowed to store a Scheme object in the first cell entry and a | |
1200 | non-Scheme object in the second cell entry. | |
1201 | ||
1202 | @c Fixme:shouldn't this rather be SCM_PAIRP / SCM_PAIR_P ? | |
1203 | @deftypefn Macro int SCM_CONSP (SCM @var{x}) | |
1204 | Determine whether the Scheme object @var{x} is a Scheme pair, | |
1205 | i.e. whether @var{x} references a heap cell consisting of exactly two | |
1206 | entries, where both entries contain a Scheme object. In this case, both | |
1207 | entries will have to be accessed using the @code{SCM_CELL_OBJECT} | |
c4d0cddd NJ |
1208 | macros. Conversely, if the @code{SCM_CONSP} predicate is not |
1209 | fulfilled, the first entry of the Scheme cell is guaranteed not to be a | |
1210 | Scheme value and thus the first cell entry must be accessed using the | |
505392ae NJ |
1211 | @code{SCM_CELL_WORD_0} macro. |
1212 | @end deftypefn | |
1213 | ||
1214 | ||
01adf598 KR |
1215 | @c Local Variables: |
1216 | @c TeX-master: "guile.texi" | |
1217 | @c End: |