@c -*-texinfo-*-
@c This is part of the GNU Guile Reference Manual.
@c Copyright (C) 1996, 1997, 2000, 2001, 2002, 2003, 2004, 2005
@c   Free Software Foundation, Inc.
@c See the file guile.texi for copying conditions.

@page
@node General Libguile Concepts
@section General concepts for using libguile

When you want to embed the Guile Scheme interpreter into your program or
library, you need to link it against the @file{libguile} library
(@pxref{Linking Programs With Guile}). Once you have done this, your C
code has access to a number of data types and functions that can be used
to invoke the interpreter, or make new functions that you have written
in C available to be called from Scheme code, among other things.

Scheme is different from C in a number of significant ways, and Guile
tries to make the advantages of Scheme available to C as well. Thus, in
addition to a Scheme interpreter, libguile also offers dynamic types,
garbage collection, continuations, arithmetic on arbitrarily sized
numbers, and other things.

The two fundamental concepts are dynamic types and garbage collection.
You need to understand how libguile offers them to C programs in order
to use the rest of libguile. Also, the more general control flow of
Scheme caused by continuations needs to be dealt with.

Asynchronous signal handlers and multi-threading are already familiar
to C programmers, but there are a few additional rules when using
them together with libguile.

@menu
* Dynamic Types::               Dynamic Types.
* Garbage Collection::          Garbage Collection.
* Control Flow::                Control Flow.
* Asynchronous Signals::        Asynchronous Signals.
* Multi-Threading::             Multi-Threading.
@end menu

@node Dynamic Types
@subsection Dynamic Types

Scheme is a dynamically-typed language; this means that the system
cannot, in general, determine the type of a given expression at compile
time. Types only become apparent at run time. Variables do not have
fixed types; a variable may hold a pair at one point, an integer at the
next, and a thousand-element vector later. Instead, values, not
variables, have fixed types.

In order to implement standard Scheme functions like @code{pair?} and
@code{string?} and provide garbage collection, the representation of
every value must contain enough information to accurately determine its
type at run time. Often, Scheme systems also use this information to
determine whether a program has attempted to apply an operation to an
inappropriately typed value (such as taking the @code{car} of a string).

Because variables, pairs, and vectors may hold values of any type,
Scheme implementations use a uniform representation for values: a
single type large enough to hold either a complete value or a pointer
to a complete value, along with the necessary typing information.

In Guile, this uniform representation of all Scheme values is the C type
@code{SCM}. This is an opaque type and its size is typically equivalent
to that of a pointer to @code{void}. Thus, @code{SCM} values can be
passed around efficiently and they take up reasonably little storage on
their own.

The most important rule is: You never access a @code{SCM} value
directly; you only pass it to functions or macros defined in libguile.

As an obvious example, although a @code{SCM} variable can contain
integers, you cannot compute the sum of two @code{SCM} values by adding
them with the C @code{+} operator. You must use the libguile function
@code{scm_sum}.

Less obvious and therefore more important to keep in mind is that you
also cannot directly test @code{SCM} values for truth. In Scheme,
the value @code{#f} is considered false and of course a @code{SCM}
variable can represent that value. But there is no guarantee that the
@code{SCM} representation of @code{#f} looks false to C code as well.
You need to use @code{scm_is_true} or @code{scm_is_false} to test a
@code{SCM} value for truth or falsehood, respectively.

You also cannot directly compare two @code{SCM} values to find out
whether they are identical (that is, whether they are @code{eq?} in
Scheme terms). You need to use @code{scm_is_eq} for this.

The one exception is that you can directly assign a @code{SCM} value to
a @code{SCM} variable by using the C @code{=} operator.
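
As a brief sketch of the truth and identity rules, the following two
helpers inspect @code{SCM} values only through libguile. The names
@code{my_default_if_false} and @code{my_is_empty_list} are made up for
this illustration; @code{SCM_EOL} is the C representation of the empty
list.

@example
#include <libguile.h>

/* Return FALLBACK when VALUE is false in the Scheme sense.
   Writing `if (!value)' or `value == SCM_BOOL_F' would be wrong.  */
static SCM
my_default_if_false (SCM value, SCM fallback)
@{
  if (scm_is_false (value))
    return fallback;
  return value;              /* plain assignment and return are fine */
@}

/* Return 1 if X is the empty list, compared the way eq? would.  */
static int
my_is_empty_list (SCM x)
@{
  return scm_is_eq (x, SCM_EOL);
@}
@end example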

The following (contrived) example shows how to do it right. It
implements a function of two arguments (@var{a} and @var{flag}) that
returns @var{a}+1 if @var{flag} is true, else it returns @var{a}
unchanged.

@example
SCM
my_incrementing_function (SCM a, SCM flag)
@{
  SCM result;

  if (scm_is_true (flag))
    result = scm_sum (a, scm_from_int (1));
  else
    result = a;

  return result;
@}
@end example

Often, you need to convert between @code{SCM} values and appropriate C
values. For example, we needed to convert the integer @code{1} to its
@code{SCM} representation in order to add it to @var{a}. Libguile
provides many functions to do these conversions, both from C to
@code{SCM} and from @code{SCM} to C.

The conversion functions follow a common naming pattern: those that make
a @code{SCM} value from a C value have names of the form
@code{scm_from_@var{type} (@dots{})} and those that convert a @code{SCM}
value to a C value use the form @code{scm_to_@var{type} (@dots{})}.
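
A minimal sketch of the naming pattern, using the @code{int}
conversions in both directions. The function names @code{my_answer}
and @code{my_print_int} are invented for this example.

@example
#include <stdio.h>
#include <libguile.h>

/* C to SCM: scm_from_int makes a Scheme integer from a C int.  */
static SCM
my_answer (void)
@{
  return scm_from_int (42);
@}

/* SCM to C: scm_to_int signals an error if N is not an integer
   that fits into a C int.  */
static void
my_print_int (SCM n)
@{
  int c_n = scm_to_int (n);
  printf ("%d\n", c_n);
@}
@end example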

However, it is best to avoid converting values when you can. When you
must combine C values and @code{SCM} values in a computation, it is
often better to convert the C values to @code{SCM} values and do the
computation by using libguile functions than to do it the other way
around (converting @code{SCM} to C and doing the computation some other
way).

As a simple example, consider this version of
@code{my_incrementing_function} from above:

@example
SCM
my_other_incrementing_function (SCM a, SCM flag)
@{
  int result;

  if (scm_is_true (flag))
    result = scm_to_int (a) + 1;
  else
    result = scm_to_int (a);

  return scm_from_int (result);
@}
@end example

This version is much less general than the original one: it will only
work for values @var{a} that can fit into an @code{int}. The original
function will work for all values that Guile can represent and that
@code{scm_sum} can understand, including integers bigger than
@code{long long}, floating point numbers, complex numbers, and new
numerical types that have been added to Guile by third-party libraries.

Also, computing with @code{SCM} is not necessarily inefficient. Small
integers will be encoded directly in the @code{SCM} value, for example,
and do not need any additional memory on the heap. See @ref{The
Libguile Runtime Environment} to find out the details.

Some special @code{SCM} values are available to C code without needing
to convert them from C values:

@multitable {Scheme value} {C representation}
@item Scheme value @tab C representation
@item @nicode{#f} @tab @nicode{SCM_BOOL_F}
@item @nicode{#t} @tab @nicode{SCM_BOOL_T}
@item @nicode{()} @tab @nicode{SCM_EOL}
@end multitable
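
For instance, these constants can be used directly as arguments to
libguile functions. The following sketch (with the made-up name
@code{my_flag_list}) builds the Scheme list @code{(#t #f)}.

@example
#include <libguile.h>

/* Build the list (#t #f), starting from the empty list.  */
static SCM
my_flag_list (void)
@{
  return scm_cons (SCM_BOOL_T, scm_cons (SCM_BOOL_F, SCM_EOL));
@}
@end example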

In addition to @code{SCM}, Guile also defines the related type
@code{scm_t_bits}. This is an unsigned integral type of sufficient
size to hold all information that is directly contained in a
@code{SCM} value. The @code{scm_t_bits} type is used internally by
Guile to do all the bit twiddling explained in @ref{The Libguile
Runtime Environment}, but you will encounter it occasionally in low-level
user code as well.


@node Garbage Collection
@subsection Garbage Collection

As explained above, the @code{SCM} type can represent all Scheme values.
Some values fit entirely into a @code{SCM} value (such as small
integers), but other values require additional storage in the heap (such
as strings and vectors). This additional storage is managed
automatically by Guile. You don't need to explicitly deallocate it
when a @code{SCM} value is no longer used.

Two things must be guaranteed so that Guile is able to manage the
storage automatically: it must know about all blocks of memory that have
ever been allocated for Scheme values, and it must know about all Scheme
values that are still being used. Given this knowledge, Guile can
periodically free all blocks that have been allocated but are not used
by any active Scheme values. This activity is called @dfn{garbage
collection}.

It is easy for Guile to remember all blocks of memory that it has
allocated for use by Scheme values, but you need to help it with finding
all Scheme values that are in use by C code.

You do this when writing a SMOB mark function, for example
(@pxref{Garbage Collecting Smobs}). By calling this function, the
garbage collector learns about all references that your SMOB has to
other @code{SCM} values.

Other references to @code{SCM} objects, such as global variables of type
@code{SCM} or other random data structures in the heap that contain
fields of type @code{SCM}, can be made visible to the garbage collector
by calling the functions @code{scm_gc_protect_object} or
@code{scm_permanent_object}. You normally use these functions for
long-lived objects such as a hash table that is stored in a global
variable. For temporary references in local variables or function
arguments, using these functions would be too expensive.
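
A minimal sketch of protecting a long-lived object, assuming a
hypothetical global @code{my_table} that is set up once during
initialization. It uses @code{scm_gc_protect_object}, which keeps the
object alive until a matching @code{scm_gc_unprotect_object} call.

@example
#include <libguile.h>

/* Hypothetical module-wide table, stored in a C global.  */
static SCM my_table;

void
my_init_table (void)
@{
  /* scm_gc_protect_object returns its argument, so the protected
     table can be stored right away.  */
  my_table = scm_gc_protect_object (scm_c_make_hash_table (31));
@}
@end example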

These references are handled differently: Local variables (and function
arguments) of type @code{SCM} are automatically visible to the garbage
collector. This works because the collector scans the stack for
potential references to @code{SCM} objects and considers all referenced
objects to be alive. The scanning considers each and every word of the
stack, regardless of what it is actually used for, and then decides
whether it could possibly be a reference to a @code{SCM} object. Thus,
the scanning is guaranteed to find all actual references, but it might
also find words that only accidentally look like references. These
``false positives'' might keep @code{SCM} objects alive that would
otherwise be considered dead. While this might waste memory, keeping an
object around longer than it strictly needs to be is harmless. This is
why this technique is called ``conservative garbage collection''. In
practice, the wasted memory does not seem to be a problem.

The stack of every thread is scanned in this way and the registers of
the CPU and all other memory locations where local variables or function
parameters might show up are included in this scan as well.

The consequence of the conservative scanning is that you can just
declare local variables and function parameters of type @code{SCM} and
be sure that the garbage collector will not free the corresponding
objects.

However, a local variable or function parameter is only protected as
long as it is really on the stack (or in some register). As an
optimization, the C compiler might reuse its location for some other
value and the @code{SCM} object would no longer be protected. Normally,
this leads to exactly the right behavior: the compiler will only
overwrite a reference when it is no longer needed and thus the object
becomes unprotected precisely when the reference disappears, just as
wanted.

There are situations, however, where a @code{SCM} object needs to be
around longer than its reference from a local variable or function
parameter. This happens, for example, when you retrieve some pointer
from a smob and work with that pointer directly. The reference to the
@code{SCM} smob object might be dead after the pointer has been
retrieved, but the pointer itself (and the memory pointed to) is still
in use and thus the smob object must be protected. The compiler does
not know about this connection and might overwrite the @code{SCM}
reference too early.

To get around this problem, you can use @code{scm_remember_upto_here_1}
and its cousins. They keep the compiler from overwriting the
reference before the point of the call. For a typical example of their
use, see @ref{Remembering During Operations}.
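
The situation can be sketched as follows. The smob layout
@code{struct my_image}, the tag @code{my_image_tag}, and the function
name are assumptions made for this illustration; a real program would
create the tag with @code{scm_make_smob_type}.

@example
#include <string.h>
#include <libguile.h>

/* Hypothetical smob payload and type tag, set up elsewhere.  */
struct my_image
@{
  size_t len;
  unsigned char *pixels;
@};

static scm_t_bits my_image_tag;

SCM
my_clear_image (SCM image_smob)
@{
  struct my_image *img;

  scm_assert_smob_type (my_image_tag, image_smob);
  img = (struct my_image *) SCM_SMOB_DATA (image_smob);

  /* From here on only IMG is used, so the compiler could drop
     IMAGE_SMOB from the stack and registers.  */
  memset (img->pixels, 0, img->len);

  /* Keep the smob, and thus the memory behind IMG, alive until here.  */
  scm_remember_upto_here_1 (image_smob);
  return SCM_UNSPECIFIED;
@}
@end example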

@node Control Flow
@subsection Control Flow

Scheme has a more general view of program flow than C, both locally and
non-locally.

Controlling the local flow of control involves things like gotos, loops,
calling functions and returning from them. Non-local control flow
refers to situations where the program jumps across one or more levels
of function activations without using the normal call or return
operations.

The primitive means of C for local control flow is the @code{goto}
statement, together with @code{if}. Loops done with @code{for},
@code{while} or @code{do} could in principle be rewritten with just
@code{goto} and @code{if}. In Scheme, the primitive means for local
control flow is the @emph{function call} (together with @code{if}).
Thus, the repetition of some computation in a loop is ultimately
implemented by a function that calls itself, that is, by recursion.

This approach is theoretically very powerful since it is easier to
reason formally about recursion than about gotos. In C, using
recursion exclusively would not be practical, though, since it would eat
up the stack very quickly. In Scheme, however, it is practical:
function calls that appear in a @dfn{tail position} do not use any
additional stack space (@pxref{Tail Calls}).

A function call is in a tail position when it is the last thing the
calling function does. The value returned by the called function is
immediately returned from the calling function. In the following
example, the call to @code{bar-1} is in a tail position, while the
call to @code{bar-2} is not. (The call to @code{1-} in @code{foo-2}
is in a tail position, though.)

@lisp
(define (foo-1 x)
  (bar-1 (1- x)))

(define (foo-2 x)
  (1- (bar-2 x)))
@end lisp

Thus, when you take care to recurse only in tail positions, the
recursion will only use constant stack space and will be as good as a
loop constructed from gotos.

Scheme offers a few syntactic abstractions (@code{do} and @dfn{named}
@code{let}) that make writing loops slightly easier.

But only Scheme functions can call other functions in a tail position:
C functions cannot. This matters when you have, say, two functions
that call each other recursively to form a common loop. The following
(unrealistic) example shows how one might go about determining whether
a non-negative integer @var{n} is even or odd.

@lisp
(define (my-even? n)
  (cond ((zero? n) #t)
        (else (my-odd? (1- n)))))

(define (my-odd? n)
  (cond ((zero? n) #f)
        (else (my-even? (1- n)))))
@end lisp

Because the calls to @code{my-even?} and @code{my-odd?} are in tail
positions, these two procedures can be applied to arbitrarily large
integers without overflowing the stack. (They will still take a lot
of time, of course.)

However, if one or both of the two procedures were rewritten in C, the
rewritten procedure could no longer call its companion in a tail
position (since C does not have this concept). You might need to take
this consideration into account when deciding which parts of your
program to write in Scheme and which in C.
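
One way to keep stack use constant after moving such a pair to C is to
merge the two procedures into a single loop. The sketch below is only
an illustration of that idea: the name @code{my_even_p} is invented,
and it assumes that @var{n} fits into an @code{unsigned long}.

@example
/* C stand-in for the my-even?/my-odd? pair: the mutual tail calls
   become one loop that flips a flag on every step.  */
static int
my_even_p (unsigned long n)
@{
  int even = 1;              /* zero is even */

  while (n > 0)
    @{
      even = !even;
      n--;
    @}
  return even;
@}
@end example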

In addition to calling functions and returning from them, a Scheme
program can also exit non-locally from a function so that the control
flow returns directly to an outer level. This means that some functions
might not return at all.

Moreover, a Scheme program can not only jump to some outer level of
control, it can also jump back into the middle of a function that has
already exited. This might cause some functions to return more than
once.

In general, these non-local jumps are done by invoking
@dfn{continuations} that have previously been captured using
@code{call-with-current-continuation}. Guile also offers a slightly
restricted set of functions, @code{catch} and @code{throw}, that can
only be used for non-local exits. This restriction makes them more
efficient. Error reporting (with the function @code{error}) is
implemented by invoking @code{throw}, for example. The functions
@code{catch} and @code{throw} belong to the topic of @dfn{exceptions}.

Since Scheme functions can call C functions and vice versa, C code can
experience the more general control flow of Scheme as well. It is
possible that a C function will not return at all, or will return more
than once. While C does offer @code{setjmp} and @code{longjmp} for
non-local exits, it is still an unusual thing for C code. In
contrast, non-local exits are very common in Scheme, mostly to report
errors.

You need to be prepared for the non-local jumps in the control flow
whenever you use a function from @code{libguile}: it is best to assume
that any @code{libguile} function might signal an error or run a pending
signal handler (which in turn can do arbitrary things).

It is often necessary to take cleanup actions when control leaves a
function non-locally. Also, when control returns non-locally, some
setup actions might be called for. For example, the Scheme function
@code{with-output-to-port} needs to modify the global state so that
@code{current-output-port} returns the port passed to
@code{with-output-to-port}. The global output port needs to be reset to
its previous value when @code{with-output-to-port} returns normally or
when it is exited non-locally. Likewise, the port needs to be set again
when control enters non-locally.

Scheme code can use the @code{dynamic-wind} function to arrange for
the setting and resetting of the global state. C code can use the
corresponding @code{scm_internal_dynamic_wind} function, or a
@code{scm_dynwind_begin}/@code{scm_dynwind_end} pair together with
suitable ``dynwind actions'' (@pxref{Dynamic Wind}).
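
As a rough sketch of the second approach, the following function frees
a temporary C string no matter how control leaves the dynwind context.
The name @code{my_call_with_copy} is hypothetical, and the sketch
assumes that @var{proc} is a procedure of one argument.

@example
#include <libguile.h>

/* Call PROC with a copy of STR, making sure the temporary C string
   is freed even if PROC exits non-locally.  */
SCM
my_call_with_copy (SCM proc, SCM str)
@{
  char *c_str;
  SCM result;

  scm_dynwind_begin (0);

  c_str = scm_to_locale_string (str);   /* allocated with malloc */
  scm_dynwind_free (c_str);             /* dynwind action: free on exit */

  result = scm_call_1 (proc, scm_from_locale_string (c_str));

  scm_dynwind_end ();                   /* also runs the free action */
  return result;
@}
@end example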

Instead of coping with non-local control flow, you can also prevent it
by erecting a @emph{continuation barrier} (@pxref{Continuation
Barriers}). The function @code{scm_c_with_continuation_barrier}, for
example, is guaranteed to return exactly once.
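
A small sketch of such a barrier, assuming the calling thread is
already in guile mode. The helper names are made up. The sketch relies
on @code{scm_c_with_continuation_barrier} returning @code{NULL} when
an error was signalled inside the barrier.

@example
#include <libguile.h>

static void *
my_eval_body (void *data)
@{
  scm_c_eval_string ((const char *) data);
  return data;               /* any non-NULL value marks success */
@}

/* Evaluate EXPR; return 1 on success and 0 if an error was signalled.
   Control comes back here exactly once in either case.  */
int
my_safe_eval (const char *expr)
@{
  return scm_c_with_continuation_barrier (my_eval_body, (void *) expr)
         != NULL;
@}
@end example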

@node Asynchronous Signals
@subsection Asynchronous Signals

You cannot call libguile functions from handlers for POSIX signals, but
you can register Scheme handlers for POSIX signals such as
@code{SIGINT}. These handlers do not run during the actual signal
delivery. Instead, they are run when the program (more precisely, the
thread that the handler has been registered for) reaches the next
@emph{safe point}.

The libguile functions themselves have many such safe points.
Consequently, you must be prepared for arbitrary actions anytime you
call a libguile function. For example, even @code{scm_cons} can contain
a safe point and when a signal handler is pending for your thread,
calling @code{scm_cons} will run this handler and anything might happen,
including a non-local exit although @code{scm_cons} would not ordinarily
do such a thing on its own.

If you do not want to allow the running of asynchronous signal handlers,
you can block them temporarily with @code{scm_dynwind_block_asyncs}, for
example. @xref{System asyncs}.
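
A minimal sketch of blocking asyncs around two updates that belong
together. The globals @code{my_first} and @code{my_second} and the
function name are hypothetical, and the sketch leaves out their
protection from garbage collection.

@example
#include <libguile.h>

/* Hypothetical pair of values that must always change together.  */
static SCM my_first, my_second;

void
my_update_both (SCM a, SCM b)
@{
  scm_dynwind_begin (0);
  scm_dynwind_block_asyncs ();   /* no async handlers until the end */

  /* scm_cons may contain a safe point, but pending handlers are
     held back while asyncs are blocked.  */
  my_first = scm_cons (a, SCM_EOL);
  my_second = scm_cons (b, SCM_EOL);

  scm_dynwind_end ();            /* asyncs may run again */
@}
@end example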

Since signal handling in Guile relies on safe points, you need to make
sure that your functions offer enough of them. Normally, calling
libguile functions in the normal course of action is all that is needed.
But when a thread might spend a long time in a code section that calls
no libguile function, it is good to include explicit safe points. This
can allow the user to interrupt your code with @kbd{C-c}, for example.

You can do this with the macro @code{SCM_TICK}. This macro is
syntactically a statement. That is, you could use it like this:

@example
while (1)
  @{
    SCM_TICK;
    do_some_work ();
  @}
@end example

Frequent execution of a safe point is even more important in
multi-threaded programs (@pxref{Multi-Threading}).

@node Multi-Threading
@subsection Multi-Threading

Guile can be used in multi-threaded programs just as well as in
single-threaded ones.

Each thread that wants to use functions from libguile must put itself
into @emph{guile mode} and must then follow a few rules. If it doesn't
want to honor these rules in certain situations, a thread can
temporarily leave guile mode (but can no longer use libguile functions
during that time, of course).

Threads enter guile mode by calling @code{scm_with_guile},
@code{scm_boot_guile}, or @code{scm_init_guile}. As explained in the
reference documentation for these functions, Guile will then learn about
the stack bounds of the thread and can protect the @code{SCM} values
that are stored in local variables. When a thread puts itself into
guile mode for the first time, it gets a Scheme representation and is
listed by @code{all-threads}, for example.
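
As an illustrative sketch, a thread created with plain
@code{pthread_create} can enter guile mode through
@code{scm_with_guile}. The function names and the evaluated expression
are made up for this example.

@example
#include <pthread.h>
#include <libguile.h>

static void *
my_guile_work (void *data)
@{
  /* In guile mode: libguile functions may be used here.  */
  scm_c_eval_string ("(display \"hello from a Guile thread\\n\")");
  return NULL;
@}

static void *
my_thread_main (void *arg)
@{
  /* Not yet in guile mode; scm_with_guile enters it for the call.  */
  return scm_with_guile (my_guile_work, arg);
@}

int
main (void)
@{
  pthread_t thread;

  if (pthread_create (&thread, NULL, my_thread_main, NULL) != 0)
    return 1;
  pthread_join (thread, NULL);
  return 0;
@}
@end example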

While in guile mode, a thread promises to reach a safe point
reasonably frequently (@pxref{Asynchronous Signals}). In addition to
running signal handlers, these points are also potential rendezvous
points of all guile mode threads where Guile can orchestrate global
things like garbage collection. Consequently, when a thread in guile
mode blocks and no longer reaches safe points, it might cause
all other guile mode threads to block as well. To prevent this from
happening, a guile mode thread should either only block in libguile
functions (which know how to do it right), or should temporarily leave
guile mode with @code{scm_without_guile}.
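
The second option can be sketched like this. The request structure and
the function names are assumptions for the example; the blocking
@code{read} call runs outside guile mode and therefore must not use
any libguile function.

@example
#include <unistd.h>
#include <libguile.h>

struct my_read_request       /* hypothetical helper type */
@{
  int fd;
  char buf[256];
  ssize_t count;
@};

static void *
my_blocking_read (void *data)
@{
  struct my_read_request *req = data;

  /* No libguile functions may be called in here.  */
  req->count = read (req->fd, req->buf, sizeof req->buf);
  return NULL;
@}

SCM
my_read_some (SCM fd)
@{
  struct my_read_request req;

  req.fd = scm_to_int (fd);
  scm_without_guile (my_blocking_read, &req);   /* may block */

  if (req.count < 0)
    return SCM_BOOL_F;
  return scm_from_locale_stringn (req.buf, req.count);
@}
@end example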

For some common blocking operations, Guile provides convenience
functions. For example, if you want to lock a pthread mutex while in
guile mode, you might want to use @code{scm_pthread_mutex_lock} which is
just like @code{pthread_mutex_lock} except that it leaves guile mode
while blocking.

All libguile functions are (intended to be) robust in the face of
multiple threads using them concurrently. This means that there is no
risk of the internal data structures of libguile becoming corrupted in
such a way that the process crashes.

A program might still produce nonsensical results, though. Taking
hashtables as an example, Guile guarantees that you can use them from
multiple threads concurrently and a hashtable will always remain a valid
hashtable and Guile will not crash when you access it. It does not
guarantee, however, that inserting into it concurrently from two threads
will give useful results: only one insertion might actually happen, none
might happen, or the table might in general be modified in a totally
arbitrary manner. (It will still be a valid hashtable, but not the one
that you might have expected.) Guile might also signal an error when it
detects a harmful race condition.

Thus, you need to put in additional synchronization when multiple
threads want to use a single hashtable, or any other mutable Scheme
object.
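
One possible form of such synchronization is a Guile mutex that is
locked inside a dynwind context. In this sketch the globals
@code{my_table} and @code{my_table_mutex} are hypothetical and assumed
to be created and protected during initialization, for instance with
@code{scm_c_make_hash_table} and @code{scm_make_mutex}.

@example
#include <libguile.h>

/* Hypothetical shared table and its guard.  */
static SCM my_table;
static SCM my_table_mutex;

void
my_table_set (SCM key, SCM value)
@{
  scm_dynwind_begin (0);
  scm_dynwind_lock_mutex (my_table_mutex);  /* unlocked at context end */
  scm_hash_set_x (my_table, key, value);
  scm_dynwind_end ();
@}
@end example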

When writing C code for use with libguile, you should try to make it
robust as well. An example that converts a list into a vector will help
to illustrate. Here is a correct version:

@example
SCM
my_list_to_vector (SCM list)
@{
  SCM vector = scm_make_vector (scm_length (list), SCM_UNDEFINED);
  size_t len, i;

  len = SCM_SIMPLE_VECTOR_LENGTH (vector);
  i = 0;
  while (i < len && scm_is_pair (list))
    @{
      SCM_SIMPLE_VECTOR_SET (vector, i, SCM_CAR (list));
      list = SCM_CDR (list);
      i++;
    @}

  return vector;
@}
@end example

The first thing to note is that storing into a @code{SCM} location
concurrently from multiple threads is guaranteed to be robust: you don't
know which value wins but it will in any case be a valid @code{SCM}
value.

But there is no guarantee that the list referenced by @var{list} is not
modified in another thread while the loop iterates over it. Thus, while
copying its elements into the vector, the list might get longer or
shorter. For this reason, the loop must check both that it doesn't
overrun the vector (@code{SCM_SIMPLE_VECTOR_SET} does no range-checking)
and that it doesn't overrun the list (@code{SCM_CAR} and @code{SCM_CDR}
likewise do no type checking).

It is safe to use @code{SCM_CAR} and @code{SCM_CDR} on the local
variable @var{list} once it is known that the variable contains a pair.
The contents of the pair might change spontaneously, but it will always
stay a valid pair (and a local variable will of course not spontaneously
point to a different Scheme object).

Likewise, a simple vector such as the one returned by
@code{scm_make_vector} is guaranteed to always stay the same length so
that it is safe to only use @code{SCM_SIMPLE_VECTOR_LENGTH} once and
store the result. (In the example, @var{vector} is safe anyway since it
is a fresh object that no other thread can possibly know about until it
is returned from @code{my_list_to_vector}.)

Of course the behavior of @code{my_list_to_vector} is suboptimal when
@var{list} does indeed get asynchronously lengthened or shortened in
another thread. But it is robust: it will always return a valid vector.
That vector might be shorter than expected, or its last elements might
be unspecified, but it is a valid vector and if a program wants to rule
out these cases, it must avoid modifying the list asynchronously.

Here is another version that is also correct:

@example
SCM
my_pedantic_list_to_vector (SCM list)
@{
  SCM vector = scm_make_vector (scm_length (list), SCM_UNDEFINED);
  size_t len, i;

  len = SCM_SIMPLE_VECTOR_LENGTH (vector);
  i = 0;
  while (i < len)
    @{
      SCM_SIMPLE_VECTOR_SET (vector, i, scm_car (list));
      list = scm_cdr (list);
      i++;
    @}

  return vector;
@}
@end example

This version uses the type-checking and thread-robust functions
@code{scm_car} and @code{scm_cdr} instead of the faster, but less robust
macros @code{SCM_CAR} and @code{SCM_CDR}. When the list is shortened
(that is, when @var{list} holds a non-pair), @code{scm_car} will throw
an error. This might be preferable to just returning a half-initialized
vector.

The API for accessing vectors and arrays of various kinds from C takes a
slightly different approach to thread-robustness. In order to get at
the raw memory that stores the elements of an array, you need to
@emph{reserve} that array as long as you need the raw memory. During
the time an array is reserved, its elements can still spontaneously
change their values, but the memory itself and other things like the
size of the array are guaranteed to stay fixed. Any operation that
would change these parameters of an array that is currently reserved
will signal an error. In order to avoid these errors, a program should
of course put suitable synchronization mechanisms in place. As you can
see, Guile itself is again only concerned about robustness, not about
correctness: without proper synchronization, your program will likely
not be correct, but the worst consequence is an error message.

Real thread safety often requires that a critical section of code is
executed in a certain restricted manner. A common requirement is that
the code section is not entered a second time when it is already being
executed. Locking a mutex while in that section ensures that no other
thread will start executing it, blocking asyncs ensures that no
asynchronous code enters the section again from the current thread,
and the error checking of Guile mutexes guarantees that an error is
signalled when the current thread accidentally reenters the critical
section via recursive function calls.

Guile provides two mechanisms to support critical sections as outlined
above. You can either use the macros
@code{SCM_CRITICAL_SECTION_START} and @code{SCM_CRITICAL_SECTION_END}
for very simple sections, or use a dynwind context together with a
call to @code{scm_dynwind_critical_section}.

The macros only work reliably for critical sections that are
guaranteed to not cause a non-local exit. They also do not detect an
accidental reentry by the current thread. Thus, you should probably
only use them to delimit critical sections that do not contain calls
to libguile functions or to other external functions that might do
complicated things.

The function @code{scm_dynwind_critical_section}, on the other hand,
will correctly deal with non-local exits because it requires a dynwind
context. Also, by using a separate mutex for each critical section,
it can detect accidental reentries.
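
A minimal sketch of the dynwind variant follows. The globals
@code{my_section_mutex} and @code{my_counter} are hypothetical and
assumed to be initialized and protected elsewhere, the mutex with
@code{scm_make_recursive_mutex}.

@example
#include <libguile.h>

/* Hypothetical per-section mutex and shared Scheme integer.  */
static SCM my_section_mutex;
static SCM my_counter;

void
my_bump_counter (void)
@{
  scm_dynwind_begin (0);
  scm_dynwind_critical_section (my_section_mutex);

  /* scm_sum may exit non-locally; leaving the context then
     unlocks the mutex as well.  */
  my_counter = scm_sum (my_counter, scm_from_int (1));

  scm_dynwind_end ();
@}
@end example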