MLNLFFIImplementation
=====================

MLton's implementation of the <:MLNLFFI:> library differs from the
SML/NJ implementation in two important ways:

* MLton cannot utilize the `Unsafe.cast` "cheat" described in Section
3.7 of <!Cite(Blume01)>.  (MLton's representation of
<:Closure:closures> and
<:PackedRepresentation:aggressive representation> optimizations make
an `Unsafe.cast` even more "unsafe" than in other implementations.)
+
--
We have considered two solutions:

** One solution is to utilize an additional type parameter (as
described in Section 3.7 of <!Cite(Blume01)>):
+
--
__________
[source,sml]
----
signature C = sig
   type ('t, 'f, 'c) obj
   eqtype ('t, 'f, 'c) obj'
   ...
   type ('o, 'f) ptr
   eqtype ('o, 'f) ptr'
   ...
   type 'f fptr
   type 'f fptr'
   ...
   structure T : sig
      type ('t, 'f) typ
      ...
   end
end
----

The rule for `('t, 'f, 'c) obj`, `('t, 'f, 'c) ptr`, and also
`('t, 'f) T.typ` is that whenever `F fptr` occurs within the
instantiation of `'t`, then `'f` must be instantiated to `F`.  In all
other cases, `'f` will be instantiated to `unit`.
__________

(In the actual MLton implementation, an abstract type `naf`
(not-a-function) is used instead of `unit`.)

While this means that type-annotated programs may not type-check under
both the SML/NJ implementation and the MLton implementation, this
should not be a problem in practice.  Tools like `ml-nlffigen`, which
are necessarily implementation dependent (in order to make
<:CallingFromSMLToCFunctionPointer:calls through a C function
pointer>), may be easily extended to emit the additional type
parameter.  Client code which uses such generated glue-code (e.g.,
Section 1 of <!Cite(Blume01)>) need rarely write type-annotations,
thanks to the magic of type inference.
--
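The idea behind the additional type parameter can be illustrated with a toy model. The sketch below is hypothetical and much simpler than the real library: the structure `ExtraParam`, the use of `word` as a pretend address, and the values `incTyp` and `answer` are all assumptions made for illustration. It shows only that once the RTTI type carries `'f`, a typed caller can be stored and retrieved without any cast.

```sml
(* Hypothetical miniature, not the MLNLFFI source: the RTTI type "typ"
   gets an extra parameter 'f so that a typed caller can be stored
   without Unsafe.cast. *)
structure ExtraParam :> sig
   type naf                          (* stands in for "not a function" *)
   type 'f fptr
   type ('t, 'f) typ
   val sint : (int, naf) typ
   val fptr : (word -> 'f) -> ('f fptr, 'f) typ
   val mkFptr : word -> 'f fptr
   val call : ('f fptr, 'f) typ * 'f fptr -> 'f
end = struct
   datatype naf = NAF
   type 'f fptr = word               (* a pretend address; 'f is phantom *)
   datatype ('t, 'f) typ = Base | Fn of word -> 'f
   val sint = Base
   fun fptr mk = Fn mk
   fun mkFptr addr = addr
   fun call (Fn mk, addr) = mk addr
     | call (Base, _) = raise Fail "not a function pointer"
end

(* A stand-in for a C function: the "address" is added to the argument. *)
val incTyp = ExtraParam.fptr (fn addr => fn x => x + Word.toInt addr)
val answer = ExtraParam.call (incTyp, ExtraParam.mkFptr 0w1) 41
```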

** The above implementation suffers from two disadvantages.
+
--
First, it changes the MLNLFFI Library interface, meaning that the same
program may not type-check under both the SML/NJ implementation and
the MLton implementation (though, in light of type inference and the
richer `MLRep` structure provided by MLton, this point is mostly
moot).

Second, it appears to unnecessarily duplicate type information.  For
example, an external C variable of type `int (* f[3])(int)` (that is,
an array of three function pointers) would be represented by the SML
type `(((sint -> sint) fptr, dec dg3) arr, sint -> sint, rw) obj`.
One might well ask why the `'f` instantiation (`sint -> sint` in this
case) cannot be _extracted_ from the `'t` instantiation
(`((sint -> sint) fptr, dec dg3) arr` in this case), obviating the
need for a separate _function-type_ type argument.  There are a number
of components to a complete answer to this question.  Foremost is the
fact that <:StandardML: Standard ML> supports neither (general)
type-level functions nor intensional polymorphism.

A more direct answer for MLNLFFI is that in the SML/NJ implementation,
the types `('t, 'c) obj` and `('t, 'c) ptr` are defined in such a way
that the type variables `'t` and `'c` are <:PhantomType: phantom>
(not contributing to the run-time representation of an
`('t, 'c) obj` or `('t, 'c) ptr` value), despite the fact that the
types `((sint -> sint) fptr, rw) ptr` and
`((double -> double) fptr, rw) ptr` necessarily carry distinct (and
type incompatible) run-time (C-)type information (RTTI), corresponding
to the different calling conventions of the two C functions.  The
`Unsafe.cast` "cheat" overcomes the type incompatibility without
introducing a new type variable (as in the first solution above).

Hence, the reason that the _function-type_ type cannot be extracted
from the `'t` type variable instantiation is that the type of the
representation of RTTI doesn't even _see_ the (phantom) `'t` type
variable.  The solution which presents itself is to give up on the
phantomness of the `'t` type variable, making it available to the
representation of RTTI.

This is not without some small drawbacks.  Because many of the types
used to instantiate `'t` carry more structure than is strictly
necessary for `'t`'s RTTI, it is sometimes necessary to wrap and
unwrap RTTI to accommodate the additional structure.  (In the other
implementations, the corresponding operations can pass along the RTTI
unchanged.)  However, these coercions contribute minuscule overhead;
in fact, in a majority of cases, MLton's optimizations will completely
eliminate the RTTI from the final program.
--
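Giving up the phantomness of `'t` can likewise be illustrated with a toy model. The sketch below is hypothetical and does not reproduce MLton's actual representation: the structure `NonPhantom`, the use of `word` as a pretend address, the pairing of an address with its caller, and the values `incTyp2` and `answer2` are all assumptions. It shows only that when the RTTI type mentions `'t` in its representation, no separate function-type parameter and no cast are needed, at the price of wrapping and unwrapping the caller.

```sml
(* Hypothetical toy model, not MLton's actual representation: the RTTI
   type mentions 't in its representation, so 't is no longer phantom. *)
structure NonPhantom :> sig
   type 'f fptr
   type 't typ
   val sint : int typ
   val fptr : (word -> 'f) -> 'f fptr typ
   val mkFptr : 'f fptr typ * word -> 'f fptr
   val call : 'f fptr -> 'f
end = struct
   (* A function pointer pairs a pretend address with its typed caller. *)
   datatype 'f fptr = FPtr of word * 'f
   (* RTTI for 't knows how to build a 't from a raw address. *)
   datatype 't typ = Base | Fn of word -> 't
   val sint = Base
   (* Wrap a caller so that it matches the structure of 'f fptr. *)
   fun fptr caller = Fn (fn addr => FPtr (addr, caller addr))
   fun mkFptr (Fn mk, addr) = mk addr
     | mkFptr (Base, _) = raise Fail "not a function pointer"
   (* Unwrap: project the typed caller back out. *)
   fun call (FPtr (_, f)) = f
end

(* A stand-in for a C function: the "address" is added to the argument. *)
val incTyp2 = NonPhantom.fptr (fn addr => fn x => x + Word.toInt addr)
val answer2 = NonPhantom.call (NonPhantom.mkFptr (incTyp2, 0w1)) 41
```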

The implementation distributed with MLton uses the second solution.

Bonus question: Why can't one use a <:UniversalType: universal type>
to eliminate the use of `Unsafe.cast`?

** Answer: ???
--

* MLton (in both of the above implementations) provides a richer
`MLRep` structure, utilizing ++Int__<N>__++ and ++Word__<N>__++
structures.
+
--
[source,sml]
----
structure MLRep = struct
   structure Char =
      struct
         structure Signed = Int8
         structure Unsigned = Word8
         (* word-style bit-operations on integers... *)
         structure SignedBitops = IntBitOps(structure I = Signed
                                            structure W = Unsigned)
      end
   structure Short =
      struct
         structure Signed = Int16
         structure Unsigned = Word16
         (* word-style bit-operations on integers... *)
         structure SignedBitops = IntBitOps(structure I = Signed
                                            structure W = Unsigned)
      end
   structure Int =
      struct
         structure Signed = Int32
         structure Unsigned = Word32
         (* word-style bit-operations on integers... *)
         structure SignedBitops = IntBitOps(structure I = Signed
                                            structure W = Unsigned)
      end
   structure Long =
      struct
         structure Signed = Int32
         structure Unsigned = Word32
         (* word-style bit-operations on integers... *)
         structure SignedBitops = IntBitOps(structure I = Signed
                                            structure W = Unsigned)
      end
   structure LongLong =
      struct
         structure Signed = Int64
         structure Unsigned = Word64
         (* word-style bit-operations on integers... *)
         structure SignedBitops = IntBitOps(structure I = Signed
                                            structure W = Unsigned)
      end
   structure Float = Real32
   structure Double = Real64
end
----

This would appear to be a better interface, even when an
implementation must choose `Int32` and `Word32` as the representation
for smaller C-types.
--
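As a rough illustration of what the sized structures offer client code, the sketch below mirrors only the `Char` component of `MLRep` and converts between the signed and unsigned views of a C `char`. It assumes a Basis Library that provides `Int8` and `Word8`, as MLton's does; the names `CharRep`, `toUnsigned`, `toSigned`, and `allOnes` are hypothetical and not part of the library.

```sml
(* Assumes Int8 and Word8 are available, as in MLton's Basis Library.
   CharRep is a hypothetical stand-in for MLRep.Char. *)
structure CharRep = struct
   structure Signed = Int8
   structure Unsigned = Word8
end

(* Reinterpret a C "signed char" as an "unsigned char": fromLargeInt
   reduces the two's complement value modulo 2^8. *)
fun toUnsigned (c : CharRep.Signed.int) : CharRep.Unsigned.word =
   CharRep.Unsigned.fromLargeInt (CharRep.Signed.toLarge c)

(* Reinterpret an "unsigned char" as a "signed char": toLargeIntX
   sign-extends, so the result always fits in eight bits. *)
fun toSigned (w : CharRep.Unsigned.word) : CharRep.Signed.int =
   CharRep.Signed.fromLarge (CharRep.Unsigned.toLargeIntX w)

val allOnes = toUnsigned ~1
val minusOne = toSigned 0wxFF
```

With `Int32` and `Word32` standing in for the smaller C types, the same conversions would type-check but would not enforce the eight-bit range.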