Import Upstream version 20180207
[hcoop/debian/mlton.git] / doc / guide / localhost / Regions
CommitLineData
7f918cf1
CE
1<!DOCTYPE html>\r
2<html lang="en">\r
3<head>\r
4<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">\r
5<meta name="generator" content="AsciiDoc 8.6.9">\r
6<title>Regions</title>\r
7<link rel="stylesheet" href="./asciidoc.css" type="text/css">\r
8<link rel="stylesheet" href="./pygments.css" type="text/css">\r
9\r
10\r
11<script type="text/javascript" src="./asciidoc.js"></script>\r
12<script type="text/javascript">\r
13/*<![CDATA[*/\r
14asciidoc.install();\r
15/*]]>*/\r
16</script>\r
17<link rel="stylesheet" href="./mlton.css" type="text/css">\r
18</head>\r
19<body class="article">\r
20<div id="banner">\r
21<div id="banner-home">\r
22<a href="./Home">MLton 20180207</a>\r
23</div>\r
24</div>\r
25<div id="header">\r
26<h1>Regions</h1>\r
27</div>\r
28<div id="content">\r
29<div id="preamble">\r
30<div class="sectionbody">\r
31<div class="paragraph"><p>In region-based memory management, the heap is divided into a\r
32collection of regions into which objects are allocated. At compile\r
33time, either in the source program or through automatic inference,\r
34allocation points are annotated with the region in which the\r
35allocation will occur. Typically, although not always, the regions\r
36are allocated and deallocated according to a stack discipline.</p></div>\r
37<div class="paragraph"><p>MLton does not use region-based memory management; it uses traditional\r
38<a href="GarbageCollection">GarbageCollection</a>. We have considered integrating regions with\r
39MLton, but in our opinion it is far from clear that regions would\r
40provide MLton with improved performance, while they would certainly\r
41add a lot of complexity to the compiler and complicate reasoning about\r
42and achieving <a href="SpaceSafety">SpaceSafety</a>. Region-based memory management and\r
43garbage collection have different strengths and weaknesses; it&#8217;s\r
44pretty easy to come up with programs that do significantly better\r
45under regions than under GC, and vice versa. We believe that it is\r
46the case that common SML idioms tend to work better under GC than\r
47under regions.</p></div>\r
48<div class="paragraph"><p>One common argument for regions is that the region operations can all\r
49be done in (approximately) constant time; therefore, you eliminate GC\r
50pause times, leading to a real-time GC. However, because of space\r
51safety concerns (see below), we believe that region-based memory\r
52management for SML must also include a traditional garbage collector.\r
53Hence, to achieve real-time memory management for MLton/SML, we\r
54believe that it would be both easier and more efficient to implement a\r
55traditional real-time garbage collector than it would be to implement\r
56a region system.</p></div>\r
57</div>\r
58</div>\r
59<div class="sect1">\r
60<h2 id="_regions_the_ml_kit_and_space_safety">Regions, the ML Kit, and space safety</h2>\r
61<div class="sectionbody">\r
62<div class="paragraph"><p>The <a href="MLKit">ML Kit</a> pioneered the use of regions for compiling\r
63Standard ML. The ML Kit maintains a stack of regions at run time. At\r
64compile time, it uses region inference to decide when data can be\r
65allocated in a stack-like manner, assigning it to an appropriate\r
66region. The ML Kit has put a lot of effort into improving the\r
67supporting analyses and representations of regions, which are all\r
68necessary to improve the performance.</p></div>\r
69<div class="paragraph"><p>Unfortunately, under a pure stack-based region system, space leaks are\r
70inevitable in theory, and costly in practice. Data for which region\r
71inference can not determine the lifetime is moved into the "global\r
72region" whose lifetime is the entire program. There are two ways in\r
73which region inference will place an object to the global region.</p></div>\r
74<div class="ulist"><ul>\r
75<li>\r
76<p>\r
77When the inference is too conservative, that is, when the data is\r
78used in a stack-like manner but the region inference can&#8217;t figure it\r
79out.\r
80</p>\r
81</li>\r
82<li>\r
83<p>\r
84When data is not used in a stack-like manner. In this case,\r
85correctness requires region inference to place the object\r
86</p>\r
87</li>\r
88</ul></div>\r
89<div class="paragraph"><p>This global region is a source of space leaks. No matter what region\r
90system you use, there are some programs such that the global region\r
91must exist, and its size will grow to an unbounded multiple of the\r
92live data size. For these programs one must have a GC to achieve\r
93space safety.</p></div>\r
94<div class="paragraph"><p>To solve this problem, the ML Kit has undergone work to combine\r
95garbage collection with region-based memory management.\r
96<a href="References#HallenbergEtAl02">HallenbergEtAl02</a> and <a href="References#Elsman03">Elsman03</a> describe the addition\r
97of a garbage collector to the ML Kit&#8217;s region-based system. These\r
98papers provide convincing evidence for space leaks in the global\r
99region. They show a number of benchmarks where the memory usage of\r
100the program running with just regions is a large multiple (2, 10, 50,\r
101even 150) of the program running with regions plus GC.</p></div>\r
102<div class="paragraph"><p>These papers also give some numbers to show the ML Kit with just\r
103regions does better than either a system with just GC or a combined\r
104system. Unfortunately, a pure region system isn&#8217;t practical because\r
105of the lack of space safety. And the other performance numbers are\r
106not so convincing, because they compare to an old version of SML/NJ\r
107and not at all with MLton. It would be interesting to see a\r
108comparison with a more serious collector.</p></div>\r
109</div>\r
110</div>\r
111<div class="sect1">\r
112<h2 id="_regions_garbage_collection_and_cyclone">Regions, Garbage Collection, and Cyclone</h2>\r
113<div class="sectionbody">\r
114<div class="paragraph"><p>One possibility is to take Cyclone&#8217;s approach, and provide both\r
115region-based memory management and garbage collection, but at the\r
116programmer&#8217;s option (<a href="References#GrossmanEtAl02">GrossmanEtAl02</a>, <a href="References#HicksEtAl03">HicksEtAl03</a>).</p></div>\r
117<div class="paragraph"><p>One might ask whether we might do the same thing&#8201;&#8212;&#8201;i.e., provide a\r
118<span class="monospaced">MLton.Regions</span> structure with explicit region based memory\r
119management operations, so that the programmer could use them when\r
120appropriate. <a href="MatthewFluet">MatthewFluet</a> has thought about this question</p></div>\r
121<div class="ulist"><ul>\r
122<li>\r
123<p>\r
124<a href="http://www.cs.cornell.edu/People/fluet/rgn-monad/index.html">http://www.cs.cornell.edu/People/fluet/rgn-monad/index.html</a>\r
125</p>\r
126</li>\r
127</ul></div>\r
128<div class="paragraph"><p>Unfortunately, his conclusion is that the SML type system is too weak\r
129to support this option, although there might be a "poor-man&#8217;s" version\r
130with dynamic checks.</p></div>\r
131</div>\r
132</div>\r
133</div>\r
134<div id="footnotes"><hr></div>\r
135<div id="footer">\r
136<div id="footer-text">\r
137</div>\r
138<div id="footer-badges">\r
139</div>\r
140</div>\r
141</body>\r
142</html>\r