\chap{The basis library}{basis-library}

The basis library is implemented with about 12,000 lines of SML code. There is
roughly one file for each signature and structure that the library specification
defines. The files are grouped in directories in the same way that the
corresponding modules are grouped in the basis library documentation. Here is
an overview of the {\tt basis-library} directory.

\begin{description}
\place{arrays-and-vectors general integer io list posix real system text}
SML code for basis library modules.

\place{basis.sml}
Automatically constructed by {\tt bin/check-basis}. Used to type check the
basis library under {\smlnj}.

\place{bind-basis}
A list of the files (in order) that define what is exported by the basis
library.

\place{build-basis}
A list of the files (in order) used to construct the basis library.

\place{Makefile}
Only has a target to clean the directory.

\place{misc}
SML code that didn't fit anywhere else. In particular, the {\tt Primitive}
structure.

\place{mlton}
The {\tt MLton} structure, which is not part of the standard basis library.
For more details on what {\tt MLton} provides, see the {\userguide}.

\place{sml-nj}
The {\tt SMLofNJ} and {\tt Unsafe} structures, which are not part of the
standard basis library.

\place{top-level}
Files describing the overloads, infixes, modules, types, and values that the
basis library makes available to user programs.
\end{description}

\subsec{How {\mlton} builds the basis environment}{build-basis-env}
The {\tt forceBasisLibrary} function in \code{\tt mlton/main/compile.sml} builds
the basis environment that is used to compile user programs. Conceptually, the
basis environment is constructed in two steps. First, all of the files in {\tt
build-basis} are concatenated together and evaluated to produce an environment
$E$. Then, all of the files in {\tt bind-basis} are concatenated and evaluated
in environment $E$ to produce a new environment $E'$, which is the top-level
environment. Another way to view it is that every user program is prefixed by
the following.
\begin{verbatim}
local
  <concatenate files in build-basis>
in
  <concatenate files in bind-basis>
end
\end{verbatim}
This view is not strictly accurate because some of the files are not SML (they
use the {\tt \_prim}, {\tt \_ffi}, and {\tt \_overload} syntaxes) and because SML
does not allow local functor or signature declarations. Here is a description
of the basis files that are not SML.
\begin{description}
\place{misc/primitive.sml}
Defines the {\tt Primitive} structure, which binds (via the {\tt \_prim}
syntax) all of the primitives provided by the compiler that the basis library
uses.
\place{mlton/syslog.sml}
Defines constants and FFI routines used to implement {\tt MLton.Syslog}.
\place{posix/primitive.sml}
Defines the {\tt PosixPrimitive} structure, which binds the constants and FFI
routines used to implement the {\tt Posix} structure.
\place{top-level/overloads.sml}
Defines the overloaded variables available at the top level, using the
{\tt \_overload} syntax: {\tt \_overload $x$: $ty$ as $y_0$ and $y_1$ and ...}
\end{description}
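
To illustrate the {\tt local}-based view of the basis environment, here is a
minimal sketch in plain SML. The names {\tt hidden}, {\tt double}, and {\tt
exported} are hypothetical and do not occur in the basis library; they only
show how declarations in the first part stay visible to the second part
without leaking into the user program.
\begin{verbatim}
local
  (* plays the role of the build-basis files *)
  val hidden = 13
  fun double x = 2 * x
in
  (* plays the role of the bind-basis files *)
  val exported = double hidden
end

(* the "user program": exported is in scope, hidden and double are not *)
val () = print (Int.toString exported ^ "\n")
\end{verbatim}
The program prints {\tt 26}. Referring to {\tt hidden} after {\tt end} would be
rejected as an unbound variable.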

\subsection{Modifying the basis library}

If you modify the basis library, you should first check that your modifications
are type correct using the {\tt bin/check-basis} script. Since {\mlton}
does not have a proper typechecker, this script uses {\smlnj}. First, it
concatenates the files as described in \secref{build-basis-env} into one file,
{\tt basis.sml}. It also replaces the nonstandard syntax ({\tt \_prim}, etc.)
and declares the toplevel types to match {\mlton}'s (necessary since {\smlnj}
uses 31 bits while {\mlton} uses 32). It then feeds {\tt basis.sml} to
{\smlnj}. If there are no type errors, a message like the following will
appear.
\begin{verbatim}
stdIn:12213.1-12213.14 Error: operator is not a function [tycon mismatch]
  operator: unit
  in expression:
    () ()
\end{verbatim}
This error message is intentionally introduced by {\tt check-basis} at the end
of {\tt basis.sml} to make it clear that {\smlnj} reached the end of {\tt
basis.sml} and has hence type checked the entire basis.

Once you have a basis library that type checks, you need to create a new version
of {\mlton} that uses this library. {\mlton} preprocesses the basis library to
create a {\tt world.mlton} file that contains the basis environment. The {\tt
world.mlton} file is stored in the {\tt lib} directory and is loaded by {\tt
mlton} when compiling a user program (see the {\tt bin/mlton} script). To build
a new {\tt world.mlton}, run {\tt make world} from within the sources directory.

\subsection{The {\tt misc} directory}

\begin{description}

\place{cleaner.sig}
Functions for registering ``cleaning'' functions to be run at certain times, in
particular at program exit. The {\tt TextIO} module uses these cleaners to
ensure that IO buffers are flushed upon exit.

\place{suffix.sml}
Code that is (conceptually) concatenated on to the end of every user program.
It just calls {\tt OS.Process.exit}. The {\tt forceBasisLibrary} function
ensures that {\tt suffix.sml} is elaborated in an environment where the basis
library {\tt OS} structure is available.

\place{top-level-handler.sml}
This defines the top level exception handler that is installed (via a special
compiler primitive) in the basis library, before any user code is run.

\end{description}
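
The cleaner interface itself is not reproduced here. As a rough user-level
analogue (an assumption made for illustration, not a description of {\tt
cleaner.sig}), the standard basis function {\tt OS.Process.atExit} also
registers an action to be run at program exit:
\begin{verbatim}
(* register an action to run when the program exits *)
val () = OS.Process.atExit (fn () => print "cleaning up\n")

val () = print "main program\n"

(* explicit exit, similar in spirit to the call that suffix.sml makes *)
val () = OS.Process.exit OS.Process.success
\end{verbatim}
The registered action runs only once {\tt OS.Process.exit} is called, after the
main program's own output.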

\subsection{Dead-code elimination}

In order to compile small programs rapidly and to cut down on executable size,
{\tt mlton} runs a pass of dead-code elimination ({\tt
mlton/core-ml/dead-code.sig}) to eliminate as much of the basis library as
possible. The dead-code elimination algorithm used is not safe in general, and
only works because the basis library implementation has special properties:
\begin{itemize}
\item it terminates
\item it performs no I/O
\item it doesn't side-effect top-level variables
\end{itemize}
The dead code elimination simply includes the minimal set of
declarations from the basis so that there are no free variables in the
user program (or basis). Hence, if you do something like the
following in the basis, it will break.
\begin{verbatim}
val r = ref 13
val _ = r := 14
\end{verbatim}
The dead code elimination will remove the {\tt val \_ = ...} binding.
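
Assuming that a declaration is either kept whole or dropped whole, one way to
preserve such an initialization is to fold the side effect into the declaration
that binds the variable, so that any program needing {\tt r} also pulls in the
assignment. This is only a sketch under that assumption, not code taken from
the basis library.
\begin{verbatim}
(* the update is part of the binding of r, so it cannot be
   dropped separately from r itself *)
val r =
   let
      val r = ref 13
      val () = r := 14
   in
      r
   end

val () = print (Int.toString (!r) ^ "\n")
\end{verbatim}
With this formulation {\tt !r} evaluates to {\tt 14} whenever {\tt r} is
retained.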