Backport from sid to buster
[hcoop/debian/mlton.git] / doc / guide / localhost / Regions
1 <!DOCTYPE html>
2 <html lang="en">
3 <head>
4 <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
5 <meta name="generator" content="AsciiDoc 8.6.9">
6 <title>Regions</title>
7 <link rel="stylesheet" href="./asciidoc.css" type="text/css">
8 <link rel="stylesheet" href="./pygments.css" type="text/css">
9
10
11 <script type="text/javascript" src="./asciidoc.js"></script>
12 <script type="text/javascript">
13 /*<![CDATA[*/
14 asciidoc.install();
15 /*]]>*/
16 </script>
17 <link rel="stylesheet" href="./mlton.css" type="text/css">
18 </head>
19 <body class="article">
20 <div id="banner">
21 <div id="banner-home">
22 <a href="./Home">MLton 20180207</a>
23 </div>
24 </div>
25 <div id="header">
26 <h1>Regions</h1>
27 </div>
28 <div id="content">
29 <div id="preamble">
30 <div class="sectionbody">
31 <div class="paragraph"><p>In region-based memory management, the heap is divided into a
32 collection of regions into which objects are allocated. At compile
33 time, either in the source program or through automatic inference,
34 allocation points are annotated with the region in which the
35 allocation will occur. Typically, although not always, the regions
36 are allocated and deallocated according to a stack discipline.</p></div>
37 <div class="paragraph"><p>MLton does not use region-based memory management; it uses traditional
38 <a href="GarbageCollection">GarbageCollection</a>. We have considered integrating regions with
39 MLton, but in our opinion it is far from clear that regions would
40 provide MLton with improved performance, while they would certainly
41 add a lot of complexity to the compiler and complicate reasoning about
42 and achieving <a href="SpaceSafety">SpaceSafety</a>. Region-based memory management and
43 garbage collection have different strengths and weaknesses; it&#8217;s
44 pretty easy to come up with programs that do significantly better
45 under regions than under GC, and vice versa. We believe that it is
46 the case that common SML idioms tend to work better under GC than
47 under regions.</p></div>
48 <div class="paragraph"><p>One common argument for regions is that the region operations can all
49 be done in (approximately) constant time; therefore, you eliminate GC
50 pause times, leading to a real-time GC. However, because of space
51 safety concerns (see below), we believe that region-based memory
52 management for SML must also include a traditional garbage collector.
53 Hence, to achieve real-time memory management for MLton/SML, we
54 believe that it would be both easier and more efficient to implement a
55 traditional real-time garbage collector than it would be to implement
56 a region system.</p></div>
57 </div>
58 </div>
59 <div class="sect1">
60 <h2 id="_regions_the_ml_kit_and_space_safety">Regions, the ML Kit, and space safety</h2>
61 <div class="sectionbody">
62 <div class="paragraph"><p>The <a href="MLKit">ML Kit</a> pioneered the use of regions for compiling
63 Standard ML. The ML Kit maintains a stack of regions at run time. At
64 compile time, it uses region inference to decide when data can be
65 allocated in a stack-like manner, assigning it to an appropriate
66 region. The ML Kit has put a lot of effort into improving the
67 supporting analyses and representations of regions, which are all
68 necessary to improve the performance.</p></div>
69 <div class="paragraph"><p>Unfortunately, under a pure stack-based region system, space leaks are
70 inevitable in theory, and costly in practice. Data for which region
71 inference can not determine the lifetime is moved into the "global
72 region" whose lifetime is the entire program. There are two ways in
73 which region inference will place an object to the global region.</p></div>
74 <div class="ulist"><ul>
75 <li>
76 <p>
77 When the inference is too conservative, that is, when the data is
78 used in a stack-like manner but the region inference can&#8217;t figure it
79 out.
80 </p>
81 </li>
82 <li>
83 <p>
84 When data is not used in a stack-like manner. In this case,
85 correctness requires region inference to place the object
86 </p>
87 </li>
88 </ul></div>
89 <div class="paragraph"><p>This global region is a source of space leaks. No matter what region
90 system you use, there are some programs such that the global region
91 must exist, and its size will grow to an unbounded multiple of the
92 live data size. For these programs one must have a GC to achieve
93 space safety.</p></div>
94 <div class="paragraph"><p>To solve this problem, the ML Kit has undergone work to combine
95 garbage collection with region-based memory management.
96 <a href="References#HallenbergEtAl02">HallenbergEtAl02</a> and <a href="References#Elsman03">Elsman03</a> describe the addition
97 of a garbage collector to the ML Kit&#8217;s region-based system. These
98 papers provide convincing evidence for space leaks in the global
99 region. They show a number of benchmarks where the memory usage of
100 the program running with just regions is a large multiple (2, 10, 50,
101 even 150) of the program running with regions plus GC.</p></div>
102 <div class="paragraph"><p>These papers also give some numbers to show the ML Kit with just
103 regions does better than either a system with just GC or a combined
104 system. Unfortunately, a pure region system isn&#8217;t practical because
105 of the lack of space safety. And the other performance numbers are
106 not so convincing, because they compare to an old version of SML/NJ
107 and not at all with MLton. It would be interesting to see a
108 comparison with a more serious collector.</p></div>
109 </div>
110 </div>
111 <div class="sect1">
112 <h2 id="_regions_garbage_collection_and_cyclone">Regions, Garbage Collection, and Cyclone</h2>
113 <div class="sectionbody">
114 <div class="paragraph"><p>One possibility is to take Cyclone&#8217;s approach, and provide both
115 region-based memory management and garbage collection, but at the
116 programmer&#8217;s option (<a href="References#GrossmanEtAl02">GrossmanEtAl02</a>, <a href="References#HicksEtAl03">HicksEtAl03</a>).</p></div>
117 <div class="paragraph"><p>One might ask whether we might do the same thing&#8201;&#8212;&#8201;i.e., provide a
118 <span class="monospaced">MLton.Regions</span> structure with explicit region based memory
119 management operations, so that the programmer could use them when
120 appropriate. <a href="MatthewFluet">MatthewFluet</a> has thought about this question</p></div>
121 <div class="ulist"><ul>
122 <li>
123 <p>
124 <a href="http://www.cs.cornell.edu/People/fluet/rgn-monad/index.html">http://www.cs.cornell.edu/People/fluet/rgn-monad/index.html</a>
125 </p>
126 </li>
127 </ul></div>
128 <div class="paragraph"><p>Unfortunately, his conclusion is that the SML type system is too weak
129 to support this option, although there might be a "poor-man&#8217;s" version
130 with dynamic checks.</p></div>
131 </div>
132 </div>
133 </div>
134 <div id="footnotes"><hr></div>
135 <div id="footer">
136 <div id="footer-text">
137 </div>
138 <div id="footer-badges">
139 </div>
140 </div>
141 </body>
142 </html>