\chap{The basis library}{basis-library}

The basis library is implemented with about 12,000 lines of SML code. There is
roughly one file for each signature and structure that the library specification
defines. The files are grouped in directories in the same way that the
corresponding modules are grouped in the basis library documentation. Here is
an overview of the {\tt basis-library} directory.

\begin{description}
\place{arrays-and-vectors general integer io list posix real system text}
SML code for basis library modules.

\place{basis.sml}
Automatically constructed by {\tt bin/check-basis}. Used to type check the
basis library under {\smlnj}.

\place{bind-basis}
A list of the files (in order) that define what is exported by the basis
library.

\place{build-basis}
A list of the files (in order) used to construct the basis library.

\place{Makefile}
Only has a target to clean the directory.

\place{misc}
SML code that didn't fit anywhere else. In particular, the {\tt Primitive}
structure.

\place{mlton}
The {\tt MLton} structure, which is not part of the standard basis library.
For more details on what {\tt MLton} provides, see the {\userguide}.

\place{sml-nj}
The {\tt SMLofNJ} and {\tt Unsafe} structures, which are not part of the
standard basis library.

\place{top-level}
Files describing the overloads, infixes, modules, types, and values that the
basis library makes available to user programs.
\end{description}

\subsec{How {\mlton} builds the basis environment}{build-basis-env}
The {\tt forceBasisLibrary} function in \code{\tt mlton/main/compile.sml} builds
the basis environment that is used to compile user programs. Conceptually, the
basis environment is constructed in two steps. First, all of the files in {\tt
build-basis} are concatenated together and evaluated to produce an environment
$E$. Then, all of the files in {\tt bind-basis} are concatenated and evaluated
in environment $E$ to produce a new environment $E'$, which is the top-level
environment. Another way to view it is that every user program is prefixed by
the following.
\begin{verbatim}
local
   <concatenate files in build-basis>
in
   <concatenate files in bind-basis>
end
\end{verbatim}
This view is not strictly accurate because some of the files are not SML (they
use the {\tt \_prim}, {\tt \_ffi}, and {\tt \_overload} syntaxes) and because SML
does not allow local functor or signature declarations. Here is a description
of the basis files that are not SML.
\begin{description}
\place{misc/primitive.sml}
Defines the {\tt Primitive} structure, which binds (via the {\tt \_prim}
syntax) all of the primitives provided by the compiler that the basis library
uses.
\place{mlton/syslog.sml}
Defines constants and FFI routines used to implement {\tt MLton.Syslog}.
\place{posix/primitive.sml}
Defines the {\tt PosixPrimitive} structure, which binds the constants and FFI
routines used to implement the {\tt Posix} structure.
\place{top-level/overloads.sml}
Defines the overloaded variables available at the top level, using the
{\tt \_overload} syntax: {\tt \_overload $x$: $ty$ as $y_0$ and $y_1$ and ...}
\end{description}
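
As a rough illustration of the two-step construction described above, the
following standard SML fragment uses {\tt local} with structure declarations,
which SML does permit. The structures {\tt Internal} and {\tt Exported} are
hypothetical stand-ins for the contents of {\tt build-basis} and {\tt
bind-basis}; they are not names taken from the basis library.
\begin{verbatim}
local
   (* Plays the role of build-basis: visible only up to "end". *)
   structure Internal =
      struct
         fun double (x: int): int = 2 * x
      end
in
   (* Plays the role of bind-basis: this is what remains visible. *)
   structure Exported =
      struct
         val quadruple = Internal.double o Internal.double
      end
end

(* A "user program": it may refer to Exported but not to Internal. *)
val eight = Exported.quadruple 2
\end{verbatim}
After the {\tt end}, only {\tt Exported} is in scope, in the same way that a
user program sees only what the {\tt bind-basis} files export.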

\subsection{Modifying the basis library}

If you modify the basis library, you should first check that your modifications
are type correct using the {\tt bin/check-basis} script. Since {\mlton}
does not have a proper typechecker, this script uses {\smlnj}. First, it
concatenates the files as described in \secref{build-basis-env} into one file,
{\tt basis.sml}. It also replaces the nonstandard syntax ({\tt \_prim}, etc.)
and declares the toplevel types to match {\mlton}'s (necessary since {\smlnj}
uses 31 bits while {\mlton} uses 32). It then feeds {\tt basis.sml} to
{\smlnj}. If there are no type errors, a message like the following will
appear.
\begin{verbatim}
stdIn:12213.1-12213.14 Error: operator is not a function [tycon mismatch]
  operator: unit
  in expression:
    () ()
\end{verbatim}
This error message is intentionally introduced by {\tt check-basis} at the end
of {\tt basis.sml} to make it clear that {\smlnj} reached the end of {\tt
basis.sml} and has hence type checked the entire basis.

Once you have a basis library that type checks, you need to create a new version
of {\mlton} that uses this library. {\mlton} preprocesses the basis library to
create a {\tt world.mlton} file that contains the basis environment. The {\tt
world.mlton} file is stored in the {\tt lib} directory and is loaded by {\tt
mlton} when compiling a user program (see the {\tt bin/mlton} script). To build
a new {\tt world.mlton}, run {\tt make world} from within the sources directory.

\subsection{The {\tt misc} directory}

\begin{description}

\place{cleaner.sig}
Functions for registering ``cleaning'' functions to be run at certain times, in
particular at program exit. The {\tt TextIO} module uses these cleaners to
ensure that IO buffers are flushed upon exit.

\place{suffix.sml}
Code that is (conceptually) concatenated on to the end of every user program.
It just calls {\tt OS.Process.exit}. The {\tt forceBasisLibrary} function
ensures that {\tt suffix.sml} is elaborated in an environment where the basis
library {\tt OS} structure is available.

\place{top-level-handler.sml}
This defines the top level exception handler that is installed (via a special
compiler primitive) in the basis library, before any user code is run.

\end{description}
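
The idea behind the cleaners can be sketched in user-level SML. The fragment
below is only an illustration: the names {\tt cleaners}, {\tt addCleaner}, and
{\tt runCleaners} are hypothetical and are not the interface declared in {\tt
cleaner.sig}. It relies on the standard {\tt OS.Process.atExit} function to run
the registered functions at program exit.
\begin{verbatim}
(* Hypothetical cleaner registry; not the interface in cleaner.sig. *)
val cleaners: (unit -> unit) list ref = ref []

(* Register a function to be run later. *)
fun addCleaner (f: unit -> unit): unit = cleaners := f :: !cleaners

(* Run the registered functions in the order they were added. *)
fun runCleaners (): unit = List.app (fn f => f ()) (rev (!cleaners))

(* Arrange for the cleaners to run at program exit. *)
val () = OS.Process.atExit runCleaners

(* Example: make sure buffered standard output is flushed on exit. *)
val () = addCleaner (fn () => TextIO.flushOut TextIO.stdOut)
\end{verbatim}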

\subsection{Dead-code elimination}

In order to compile small programs rapidly and to cut down on executable size,
{\tt mlton} runs a pass of dead-code elimination ({\tt
mlton/core-ml/dead-code.sig}) to eliminate as much of the basis library as
possible. The dead-code elimination algorithm used is not safe in general, and
only works because the basis library implementation has special properties:
\begin{itemize}
\item it terminates
\item it performs no I/O
\item it doesn't side-effect top-level variables
\end{itemize}
The dead-code elimination simply includes the minimal set of
declarations from the basis so that there are no free variables in the
user program (or basis). Hence, if you do something like the
following in the basis, it will break.
\begin{verbatim}
val r = ref 13
val _ = r := 14
\end{verbatim}
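The dead-code elimination will remove the {\tt val \_ = ...} binding, because
it binds no variable that the user program or the rest of the basis refers to.
The assignment therefore never runs, and {\tt r} still holds 13.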
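
The selection rule described above can be sketched as follows. This is not the
code in {\tt mlton/core-ml}; it is a simplified model in which each declaration
is reduced to the names it binds and the names it uses, and shadowing is
ignored. The type {\tt dec} and the function {\tt keep} are hypothetical names
introduced for this illustration.
\begin{verbatim}
(* A declaration, modelled by the names it binds and the names it uses. *)
type dec = {binds: string list, uses: string list}

fun member (x: string, xs: string list): bool =
   List.exists (fn y => x = y) xs

(* Return the basis declarations (in their original order) that are
 * needed so that no name used by the program, or by a kept
 * declaration, is left free.
 *)
fun keep (basis: dec list, programUses: string list): dec list =
   let
      (* Walk the basis from the last declaration to the first.  A
       * declaration is kept only if it binds a name that is needed;
       * its own uses then become needed as well.
       *)
      fun loop ([], _, acc) = acc
        | loop ((d: dec) :: ds, needed, acc) =
             if List.exists (fn b => member (b, needed)) (#binds d)
                then loop (ds, #uses d @ needed, d :: acc)
             else loop (ds, needed, acc)
   in
      loop (rev basis, programUses, [])
   end

(* The example from the text, plus one declaration nobody uses. *)
val basis: dec list =
   [{binds = ["r"], uses = []},        (* val r = ref 13  *)
    {binds = [], uses = ["r"]},        (* val _ = r := 14 *)
    {binds = ["unused"], uses = []}]

(* A user program that mentions only r. *)
val kept = keep (basis, ["r"])
\end{verbatim}
In this model the declaration for {\tt val \_ = r := 14} binds nothing, so it
can never be selected, no matter which names the user program uses.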