Copyright (c) 2001, 2002, Lucent Technologies, Bell Laboratories

   author: Matthias Blume (blume@research.bell-labs.com)

This directory contains ML-NLFFI-Gen, a glue-code generator for
the new "NLFFI" foreign function interface.  The generator reads
C source code and emits ML code along with a description file for CM.

Compiling this generator requires the C-Kit ($/ckit-lib.cm) to be
installed.

---------------------------------------------------------------------

February 21, 2002:  Major changes:

I reworked the glue code generator in a way that lets generated code
scale better -- at the expense of some (mostly academic) generality.

Changes involve the following:

1. The functorization is gone.

2. Every top-level C declaration results in a separate top-level
   ML equivalent (implemented by its own ML source file).

3. Incomplete pointer types are treated just like their complete
   versions -- the only difference being that no RTTI will be
   available for them.  In the "light" interface, this rules out
   precisely those operations over them that C would disallow.

4. All related C sources must be supplied to ml-nlffigen together.
   Types incomplete in one source but complete in another get
   automatically completed in a cross-file fashion.

5. The handle for the shared library to link to is now abstracted as
   a function closure.  Moreover, it must be supplied as a top-level
   variable (by the programmer).  For this purpose, ml-nlffigen has
   corresponding command-line options.

These changes mean that even libraries with a very large number of
exported definitions, such as GTK, can now be handled gracefully
without reaching the limits of the ML compiler's abilities.

[The example of GTK -- for which ml-nlffigen creates several thousand (!)
separate ML source files -- puts an unusual burden on CM, though.
However, aside from running a bit longer than usual, CM handles loads
of this magnitude just fine.
Stabilizing the resulting library solves the problem entirely as far
as later clients are concerned.]

Sketch of translation- (and naming-) scheme:

  struct foo { ... }
       -->  structure ST_foo in st-foo.sml  (not exported)
                 basic type info (name, size)
          & structure S_foo in s-foo.sml
                 abstract interface to the type
                    field accessors f_xxx  (unless -light)
                                and f_xxx' (unless -heavy)
                    field types t_f_xxx
                    field RTTI typ_f_xxx
          & (unless "-nosucvt" was set)
            structures IS_foo in /is-foo.sml
                 (see discussion of struct *foo below)

  union foo { ... }
       -->  structure UT_foo in ut-foo.sml  (not exported)
                 basic type info (name, size)
          & structure U_foo in u-foo.sml
                 abstract interface to the type
                    field accessors f_xxx  (unless -light)
                                and f_xxx' (unless -heavy)
                    field types t_f_xxx
                    field RTTI typ_f_xxx
          & (unless "-nosucvt" was set)
            structures IU_foo in /iu-foo.sml
                 (see discussion of union *foo below)

  struct { ... }
       like a tagged struct whose tag is a fresh integer, or 'bar
       if 'struct { ... }' occurs in the context of a
       'typedef struct { ... } bar'

  union { ... }
       like a tagged union whose tag is a fresh integer, or 'bar
       if 'union { ... }' occurs in the context of a
       'typedef union { ... } bar'

  enum foo { ... }
       -->  structure E_foo in e-foo.sml
                 external type mlrep with
                    enum constants e_xxx
                 conversion functions
                    between tag enum and mlrep
                    between mlrep and sint
                 access functions (get/set) that operate on mlrep
                    (as an alternative to C.Get.enum/C.Set.enum,
                     which operate on sint)

       If the command-line option "-ec" ("-enum-constructors") was set
       and the values of all enum constants are different from each
       other, then mlrep will be a datatype (thus making it possible
       to pattern-match).

  enum { ... }
       If this construct appears in the context of a surrounding
       (non-anonymous) struct or union or typedef, the enumeration gets
       assigned an artificial tag (just like similar structs and unions,
       see above).
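As a rough illustration of the struct and enum cases described so far,
the following C fragment annotates a few declarations with the ML
structures that the scheme above would be expected to produce.  All C
names here (point, vec, color, level, mode, color_values, level_values,
all_distinct) are hypothetical and chosen only for this sketch; the
helper all_distinct merely mirrors the "-ec" condition and is not part
of ml-nlffigen.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical input declarations; names are illustrative only. */

/* Tagged struct: per the scheme above, expect the internal structure
 * ST_point (st-point.sml) and the exported S_point (s-point.sml). */
struct point { int x; int y; };

/* Anonymous struct inside a typedef: treated like a struct
 * whose tag is 'vec. */
typedef struct { double dx; double dy; } vec;

/* Tagged enum with pairwise distinct values: expect structure E_color
 * (e-color.sml); with "-ec" its mlrep would be a datatype. */
enum color { RED = 0, GREEN = 1, BLUE = 2 };

/* Two constants share the value 0, so the "-ec" condition
 * (all values different) is not met for this enum. */
enum level { LOW = 0, DEFAULT = 0, HIGH = 1 };

/* Anonymous enum inside a typedef: assigned the artificial tag 'mode. */
typedef enum { MODE_A, MODE_B } mode;

/* Constant values of the two tagged enums, listed for the check below. */
const int color_values[] = { RED, GREEN, BLUE };
const int level_values[] = { LOW, DEFAULT, HIGH };

/* Returns 1 if all n values are pairwise distinct, 0 otherwise.
 * This only mirrors the condition stated for "-ec". */
int all_distinct(const int *vals, size_t n)
{
    for (size_t i = 0; i < n; i++)
        for (size_t j = i + 1; j < n; j++)
            if (vals[i] == vals[j])
                return 0;
    return 1;
}
```

Called from a main function, all_distinct(color_values, 3) returns 1
and all_distinct(level_values, 3) returns 0, which matches the
distinction the "-ec" description draws between the two enums.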

       Unless the command-line option "-nocollect" was specified,
       all constants in other (truly) unnamed enumerations will be
       collected into a single enumeration represented by structure E_'.
       This single enumeration is then treated like a regular enumeration
       (including handling of "-ec" -- see above).

       With "-nocollect", each such enumeration is instead assigned a
       fresh integer tag (again, just like in the struct/union case).

  T foo (T, ..., T)  (global function/function prototype)
       -->  structure F_foo in f-foo.sml
                 containing three/four members:
                    typ :  RTTI
                    fptr:  thunkified fptr representing the C function
              maybe f'  :  light-weight function wrapper around fptr
                           Turned off by -heavy (see below).
              maybe f   :  heavy-weight function wrapper around fptr
                           Turned off by -light (see below).

  T foo;  (global variable)
       -->  structure G_foo in g-foo.sml
                 containing three members:
                    t   :  type
                    typ :  RTTI
                    obj :  thunkified object representing the C variable

  struct foo *  (without existing definition of struct foo; incomplete type)
       -->  an internal structure ST_foo with a type "tag" (just like in
            the struct foo { ... } case)
       The difference is that no structure S_foo will be generated, so
       there is no field-access interface and no RTTI (size or typ) for
       this type.  All "light-weight" functions referring to this pointer
       type will be generated; heavy-weight functions will be generated
       only if they do not require access to RTTI.

       If "-heavy" was specified but a heavy interface function cannot
       be generated because of incomplete types, then its light
       counterpart will be generated anyway.

  union foo *
       Same as with struct foo *, but replace S_foo with U_foo
       and ST_foo with UT_foo.

  Additional files for implementing function entry sequences are created
  and used internally.  They do not contribute exports, though.

Command-line options for ml-nlffigen:

  General syntax:   ml-nlffigen