Commit | Line | Data |
---|---|---|
7f918cf1 CE |
1 | <!DOCTYPE html>\r |
2 | <html lang="en">\r | |
3 | <head>\r | |
4 | <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">\r | |
5 | <meta name="generator" content="AsciiDoc 8.6.9">\r | |
6 | <title>Regions</title>\r | |
7 | <link rel="stylesheet" href="./asciidoc.css" type="text/css">\r | |
8 | <link rel="stylesheet" href="./pygments.css" type="text/css">\r | |
9 | \r | |
10 | \r | |
11 | <script type="text/javascript" src="./asciidoc.js"></script>\r | |
12 | <script type="text/javascript">\r | |
13 | /*<![CDATA[*/\r | |
14 | asciidoc.install();\r | |
15 | /*]]>*/\r | |
16 | </script>\r | |
17 | <link rel="stylesheet" href="./mlton.css" type="text/css">\r | |
18 | </head>\r | |
19 | <body class="article">\r | |
20 | <div id="banner">\r | |
21 | <div id="banner-home">\r | |
22 | <a href="./Home">MLton 20180207</a>\r | |
23 | </div>\r | |
24 | </div>\r | |
25 | <div id="header">\r | |
26 | <h1>Regions</h1>\r | |
27 | </div>\r | |
28 | <div id="content">\r | |
29 | <div id="preamble">\r | |
30 | <div class="sectionbody">\r | |
31 | <div class="paragraph"><p>In region-based memory management, the heap is divided into a\r | |
32 | collection of regions into which objects are allocated. At compile\r | |
33 | time, either in the source program or through automatic inference,\r | |
34 | allocation points are annotated with the region in which the\r | |
35 | allocation will occur. Typically, although not always, the regions\r | |
36 | are allocated and deallocated according to a stack discipline.</p></div>\r | |
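<div class="paragraph"><p>The stack discipline described above can be illustrated with a small
simulation. This is only a sketch in Python, not how MLton or any region-based
compiler is implemented; the names <span class="monospaced">Region</span> and
<span class="monospaced">RegionStack</span> are hypothetical and exist only for
this example. Each allocation names the region it goes into, and popping a
region discards all of its objects together.</p></div>

```python
class Region:
    """A region: a collection of objects that are all freed together."""

    def __init__(self, name):
        self.name = name
        self.objects = []
        self.live = True

    def alloc(self, value):
        # An allocation point annotated with this region.
        if not self.live:
            raise RuntimeError("allocation into deallocated region " + self.name)
        self.objects.append(value)
        return value


class RegionStack:
    """Regions created and destroyed in last-in, first-out order."""

    def __init__(self):
        self.stack = []

    def push(self, name):
        region = Region(name)
        self.stack.append(region)
        return region

    def pop(self):
        # Discards every object in the newest region at once.
        region = self.stack.pop()
        freed = len(region.objects)
        region.objects.clear()
        region.live = False
        return freed

    def total_objects(self):
        return sum(len(r.objects) for r in self.stack)


regions = RegionStack()
outer = regions.push("outer")
outer.alloc("result")              # survives the inner region
inner = regions.push("inner")
for i in range(3):
    inner.alloc(("temp", i))       # temporaries with a short lifetime
count_before = regions.total_objects()
freed = regions.pop()              # frees all temporaries together
count_after = regions.total_objects()
```

<div class="paragraph"><p>In this sketch the three temporaries disappear when the inner region is
popped, while the object allocated in the outer region remains.</p></div>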
37 | <div class="paragraph"><p>MLton does not use region-based memory management; it uses traditional\r | |
38 | <a href="GarbageCollection">GarbageCollection</a>. We have considered integrating regions with\r | |
39 | MLton, but in our opinion it is far from clear that regions would\r | |
40 | provide MLton with improved performance, while they would certainly\r | |
41 | add a lot of complexity to the compiler and complicate reasoning about\r | |
42 | and achieving <a href="SpaceSafety">SpaceSafety</a>. Region-based memory management and\r | |
43 | garbage collection have different strengths and weaknesses; it’s\r | |
44 | pretty easy to come up with programs that do significantly better\r | |
45 | under regions than under GC, and vice versa. We believe that common\r | |
46 | SML idioms tend to work better under GC than\r | |
47 | under regions.</p></div>\r | |
48 | <div class="paragraph"><p>One common argument for regions is that the region operations can all\r | |
49 | be done in (approximately) constant time; therefore, you eliminate GC\r | |
50 | pause times, leading to real-time memory management. However, because of space\r | |
51 | safety concerns (see below), we believe that region-based memory\r | |
52 | management for SML must also include a traditional garbage collector.\r | |
53 | Hence, to achieve real-time memory management for MLton/SML, we\r | |
54 | believe that it would be both easier and more efficient to implement a\r | |
55 | traditional real-time garbage collector than it would be to implement\r | |
56 | a region system.</p></div>\r | |
57 | </div>\r | |
58 | </div>\r | |
59 | <div class="sect1">\r | |
60 | <h2 id="_regions_the_ml_kit_and_space_safety">Regions, the ML Kit, and space safety</h2>\r | |
61 | <div class="sectionbody">\r | |
62 | <div class="paragraph"><p>The <a href="MLKit">ML Kit</a> pioneered the use of regions for compiling\r | |
63 | Standard ML. The ML Kit maintains a stack of regions at run time. At\r | |
64 | compile time, it uses region inference to decide when data can be\r | |
65 | allocated in a stack-like manner, assigning it to an appropriate\r | |
66 | region. The ML Kit has put a lot of effort into improving the\r | |
67 | supporting analyses and representations of regions, which are all\r | |
68 | necessary to improve the performance.</p></div>\r | |
69 | <div class="paragraph"><p>Unfortunately, under a pure stack-based region system, space leaks are\r | |
70 | inevitable in theory, and costly in practice. Data for which region\r | |
71 | inference cannot determine the lifetime is moved into the "global\r | |
72 | region", whose lifetime is the entire program. There are two ways in\r | |
73 | which region inference will place an object in the global region.</p></div>\r | |
74 | <div class="ulist"><ul>\r | |
75 | <li>\r | |
76 | <p>\r | |
77 | When the inference is too conservative, that is, when the data is\r | |
78 | used in a stack-like manner but the region inference can’t figure it\r | |
79 | out.\r | |
80 | </p>\r | |
81 | </li>\r | |
82 | <li>\r | |
83 | <p>\r | |
84 | When data is not used in a stack-like manner. In this case,\r | |
85 | correctness requires region inference to place the object in the global region.\r | |
86 | </p>\r | |
87 | </li>\r | |
88 | </ul></div>\r | |
89 | <div class="paragraph"><p>This global region is a source of space leaks. No matter what region\r | |
90 | system you use, there are some programs such that the global region\r | |
91 | must exist, and its size will grow to an unbounded multiple of the\r | |
92 | live data size. For these programs one must have a GC to achieve\r | |
93 | space safety.</p></div>\r | |
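<div class="paragraph"><p>The leak can be made concrete with a toy simulation. The following Python
sketch is an illustration under simplifying assumptions, not a model of the
ML Kit: the function <span class="monospaced">run_with_global_region</span> is
hypothetical, and it assumes a loop that keeps only its most recent value live
while every allocation lands in a global region that is never freed.</p></div>

```python
def run_with_global_region(steps):
    """Simulate a loop whose allocations all land in a never-freed region."""
    global_region = []   # lifetime: the entire program
    current = None       # the only value the program still needs
    for i in range(steps):
        current = ("state", i)          # the new value replaces the old one
        global_region.append(current)   # but the old one stays in the region
    live_objects = 1                    # only `current` is still reachable
    return len(global_region), live_objects


size_10, live_10 = run_with_global_region(10)
size_1000, live_1000 = run_with_global_region(1000)
ratio_10 = size_10 / live_10
ratio_1000 = size_1000 / live_1000
```

<div class="paragraph"><p>The live data stays at one object, but the region grows with the number of
iterations, so the ratio of region size to live data has no fixed bound. A
garbage collector that traces from <span class="monospaced">current</span> would
reclaim everything else.</p></div>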
94 | <div class="paragraph"><p>To solve this problem, the ML Kit has undergone work to combine\r | |
95 | garbage collection with region-based memory management.\r | |
96 | <a href="References#HallenbergEtAl02">HallenbergEtAl02</a> and <a href="References#Elsman03">Elsman03</a> describe the addition\r | |
97 | of a garbage collector to the ML Kit’s region-based system. These\r | |
98 | papers provide convincing evidence for space leaks in the global\r | |
99 | region. They show a number of benchmarks where the memory usage of\r | |
100 | the program running with just regions is a large multiple (2, 10, 50,\r | |
101 | even 150) of the program running with regions plus GC.</p></div>\r | |
102 | <div class="paragraph"><p>These papers also give some numbers to show the ML Kit with just\r | |
103 | regions does better than either a system with just GC or a combined\r | |
104 | system. Unfortunately, a pure region system isn’t practical because\r | |
105 | of the lack of space safety. The other performance numbers are also\r | |
106 | not very convincing, because they compare against an old version of SML/NJ\r | |
107 | and not at all against MLton. It would be interesting to see a\r | |
108 | comparison with a more serious collector.</p></div>\r | |
109 | </div>\r | |
110 | </div>\r | |
111 | <div class="sect1">\r | |
112 | <h2 id="_regions_garbage_collection_and_cyclone">Regions, Garbage Collection, and Cyclone</h2>\r | |
113 | <div class="sectionbody">\r | |
114 | <div class="paragraph"><p>One possibility is to take Cyclone’s approach, and provide both\r | |
115 | region-based memory management and garbage collection, but at the\r | |
116 | programmer’s option (<a href="References#GrossmanEtAl02">GrossmanEtAl02</a>, <a href="References#HicksEtAl03">HicksEtAl03</a>).</p></div>\r | |
117 | <div class="paragraph"><p>One might ask whether we might do the same thing — i.e., provide a\r | |
118 | <span class="monospaced">MLton.Regions</span> structure with explicit region-based memory\r | |
119 | management operations, so that the programmer could use them when\r | |
120 | appropriate. <a href="MatthewFluet">MatthewFluet</a> has thought about this question:</p></div>\r | |
121 | <div class="ulist"><ul>\r | |
122 | <li>\r | |
123 | <p>\r | |
124 | <a href="http://www.cs.cornell.edu/People/fluet/rgn-monad/index.html">http://www.cs.cornell.edu/People/fluet/rgn-monad/index.html</a>\r | |
125 | </p>\r | |
126 | </li>\r | |
127 | </ul></div>\r | |
128 | <div class="paragraph"><p>Unfortunately, his conclusion is that the SML type system is too weak\r | |
129 | to support this option, although there might be a "poor-man’s" version\r | |
130 | with dynamic checks.</p></div>\r | |
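<div class="paragraph"><p>To give a rough idea of what a version with dynamic checks might look like,
here is a sketch in Python. It is not the design from the work cited above and
not a proposed MLton API; <span class="monospaced">CheckedRegion</span>,
<span class="monospaced">RegionError</span>, and
<span class="monospaced">with_region</span> are hypothetical names. The idea is
that a handle that escapes its region is not rejected statically; instead, every
access checks at run time that the region is still open.</p></div>

```python
class RegionError(Exception):
    """Raised when a deallocated region is used."""


class CheckedRegion:
    def __init__(self):
        self._cells = []
        self._open = True

    def _check(self):
        # The dynamic check: performed on every operation.
        if not self._open:
            raise RegionError("region already deallocated")

    def alloc(self, value):
        self._check()
        self._cells.append(value)
        return len(self._cells) - 1   # a handle is an index into the region

    def read(self, handle):
        self._check()
        return self._cells[handle]

    def close(self):
        self._cells = []
        self._open = False


def with_region(body):
    """Run body with a fresh region and deallocate the region afterwards."""
    region = CheckedRegion()
    try:
        return body(region)
    finally:
        region.close()


escaped = []

def body(region):
    handle = region.alloc(41)
    escaped.append((region, handle))   # the handle leaks out of its scope
    return region.read(handle) + 1

result = with_region(body)

leaked_region, leaked_handle = escaped[0]
try:
    leaked_region.read(leaked_handle)
    caught = False
except RegionError:
    caught = True                      # the escape is detected at run time
```

<div class="paragraph"><p>The price of this approach is that misuse is reported only when the stale
handle is actually used, rather than at compile time.</p></div>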
131 | </div>\r | |
132 | </div>\r | |
133 | </div>\r | |
134 | <div id="footnotes"><hr></div>\r | |
135 | <div id="footer">\r | |
136 | <div id="footer-text">\r | |
137 | </div>\r | |
138 | <div id="footer-badges">\r | |
139 | </div>\r | |
140 | </div>\r | |
141 | </body>\r | |
142 | </html>\r |