| 1 | <!DOCTYPE html>\r |
| 2 | <html lang="en">\r |
| 3 | <head>\r |
| 4 | <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">\r |
| 5 | <meta name="generator" content="AsciiDoc 8.6.9">\r |
| 6 | <title>Regions</title>\r |
| 7 | <link rel="stylesheet" href="./asciidoc.css" type="text/css">\r |
| 8 | <link rel="stylesheet" href="./pygments.css" type="text/css">\r |
| 9 | \r |
| 10 | \r |
| 11 | <script type="text/javascript" src="./asciidoc.js"></script>\r |
| 12 | <script type="text/javascript">\r |
| 13 | /*<![CDATA[*/\r |
| 14 | asciidoc.install();\r |
| 15 | /*]]>*/\r |
| 16 | </script>\r |
| 17 | <link rel="stylesheet" href="./mlton.css" type="text/css">\r |
| 18 | </head>\r |
| 19 | <body class="article">\r |
| 20 | <div id="banner">\r |
| 21 | <div id="banner-home">\r |
| 22 | <a href="./Home">MLton 20180207</a>\r |
| 23 | </div>\r |
| 24 | </div>\r |
| 25 | <div id="header">\r |
| 26 | <h1>Regions</h1>\r |
| 27 | </div>\r |
| 28 | <div id="content">\r |
| 29 | <div id="preamble">\r |
| 30 | <div class="sectionbody">\r |
| 31 | <div class="paragraph"><p>In region-based memory management, the heap is divided into a\r |
| 32 | collection of regions into which objects are allocated. At compile\r |
| 33 | time, either in the source program or through automatic inference,\r |
| 34 | allocation points are annotated with the region in which the\r |
| 35 | allocation will occur. Typically, although not always, the regions\r |
| 36 | are allocated and deallocated according to a stack discipline.</p></div>\r |
| 37 | <div class="paragraph"><p>MLton does not use region-based memory management; it uses traditional\r |
| 38 | <a href="GarbageCollection">GarbageCollection</a>. We have considered integrating regions with\r |
| 39 | MLton, but in our opinion it is far from clear that regions would\r |
| 40 | provide MLton with improved performance, while they would certainly\r |
| 41 | add a lot of complexity to the compiler and complicate reasoning about\r |
| 42 | and achieving <a href="SpaceSafety">SpaceSafety</a>. Region-based memory management and\r |
| 43 | garbage collection have different strengths and weaknesses; it’s\r |
| 44 | pretty easy to come up with programs that do significantly better\r |
| 45 | under regions than under GC, and vice versa. We believe that it is\r |
| 46 | the case that common SML idioms tend to work better under GC than\r |
| 47 | under regions.</p></div>\r |
| 48 | <div class="paragraph"><p>One common argument for regions is that the region operations can all\r |
| 49 | be done in (approximately) constant time; therefore, you eliminate GC\r |
| 50 | pause times, leading to a real-time GC. However, because of space\r |
| 51 | safety concerns (see below), we believe that region-based memory\r |
| 52 | management for SML must also include a traditional garbage collector.\r |
| 53 | Hence, to achieve real-time memory management for MLton/SML, we\r |
| 54 | believe that it would be both easier and more efficient to implement a\r |
| 55 | traditional real-time garbage collector than it would be to implement\r |
| 56 | a region system.</p></div>\r |
| 57 | </div>\r |
| 58 | </div>\r |
| 59 | <div class="sect1">\r |
| 60 | <h2 id="_regions_the_ml_kit_and_space_safety">Regions, the ML Kit, and space safety</h2>\r |
| 61 | <div class="sectionbody">\r |
| 62 | <div class="paragraph"><p>The <a href="MLKit">ML Kit</a> pioneered the use of regions for compiling\r |
| 63 | Standard ML. The ML Kit maintains a stack of regions at run time. At\r |
| 64 | compile time, it uses region inference to decide when data can be\r |
| 65 | allocated in a stack-like manner, assigning it to an appropriate\r |
| 66 | region. The ML Kit has put a lot of effort into improving the\r |
| 67 | supporting analyses and representations of regions, which are all\r |
| 68 | necessary to improve the performance.</p></div>\r |
| 69 | <div class="paragraph"><p>Unfortunately, under a pure stack-based region system, space leaks are\r |
| 70 | inevitable in theory, and costly in practice. Data for which region\r |
| 71 | inference can not determine the lifetime is moved into the "global\r |
| 72 | region" whose lifetime is the entire program. There are two ways in\r |
| 73 | which region inference will place an object to the global region.</p></div>\r |
| 74 | <div class="ulist"><ul>\r |
| 75 | <li>\r |
| 76 | <p>\r |
| 77 | When the inference is too conservative, that is, when the data is\r |
| 78 | used in a stack-like manner but the region inference can’t figure it\r |
| 79 | out.\r |
| 80 | </p>\r |
| 81 | </li>\r |
| 82 | <li>\r |
| 83 | <p>\r |
| 84 | When data is not used in a stack-like manner. In this case,\r |
| 85 | correctness requires region inference to place the object\r |
| 86 | </p>\r |
| 87 | </li>\r |
| 88 | </ul></div>\r |
| 89 | <div class="paragraph"><p>This global region is a source of space leaks. No matter what region\r |
| 90 | system you use, there are some programs such that the global region\r |
| 91 | must exist, and its size will grow to an unbounded multiple of the\r |
| 92 | live data size. For these programs one must have a GC to achieve\r |
| 93 | space safety.</p></div>\r |
| 94 | <div class="paragraph"><p>To solve this problem, the ML Kit has undergone work to combine\r |
| 95 | garbage collection with region-based memory management.\r |
| 96 | <a href="References#HallenbergEtAl02">HallenbergEtAl02</a> and <a href="References#Elsman03">Elsman03</a> describe the addition\r |
| 97 | of a garbage collector to the ML Kit’s region-based system. These\r |
| 98 | papers provide convincing evidence for space leaks in the global\r |
| 99 | region. They show a number of benchmarks where the memory usage of\r |
| 100 | the program running with just regions is a large multiple (2, 10, 50,\r |
| 101 | even 150) of the program running with regions plus GC.</p></div>\r |
| 102 | <div class="paragraph"><p>These papers also give some numbers to show the ML Kit with just\r |
| 103 | regions does better than either a system with just GC or a combined\r |
| 104 | system. Unfortunately, a pure region system isn’t practical because\r |
| 105 | of the lack of space safety. And the other performance numbers are\r |
| 106 | not so convincing, because they compare to an old version of SML/NJ\r |
| 107 | and not at all with MLton. It would be interesting to see a\r |
| 108 | comparison with a more serious collector.</p></div>\r |
| 109 | </div>\r |
| 110 | </div>\r |
| 111 | <div class="sect1">\r |
| 112 | <h2 id="_regions_garbage_collection_and_cyclone">Regions, Garbage Collection, and Cyclone</h2>\r |
| 113 | <div class="sectionbody">\r |
| 114 | <div class="paragraph"><p>One possibility is to take Cyclone’s approach, and provide both\r |
| 115 | region-based memory management and garbage collection, but at the\r |
| 116 | programmer’s option (<a href="References#GrossmanEtAl02">GrossmanEtAl02</a>, <a href="References#HicksEtAl03">HicksEtAl03</a>).</p></div>\r |
| 117 | <div class="paragraph"><p>One might ask whether we might do the same thing — i.e., provide a\r |
| 118 | <span class="monospaced">MLton.Regions</span> structure with explicit region based memory\r |
| 119 | management operations, so that the programmer could use them when\r |
| 120 | appropriate. <a href="MatthewFluet">MatthewFluet</a> has thought about this question</p></div>\r |
| 121 | <div class="ulist"><ul>\r |
| 122 | <li>\r |
| 123 | <p>\r |
| 124 | <a href="http://www.cs.cornell.edu/People/fluet/rgn-monad/index.html">http://www.cs.cornell.edu/People/fluet/rgn-monad/index.html</a>\r |
| 125 | </p>\r |
| 126 | </li>\r |
| 127 | </ul></div>\r |
| 128 | <div class="paragraph"><p>Unfortunately, his conclusion is that the SML type system is too weak\r |
| 129 | to support this option, although there might be a "poor-man’s" version\r |
| 130 | with dynamic checks.</p></div>\r |
| 131 | </div>\r |
| 132 | </div>\r |
| 133 | </div>\r |
| 134 | <div id="footnotes"><hr></div>\r |
| 135 | <div id="footer">\r |
| 136 | <div id="footer-text">\r |
| 137 | </div>\r |
| 138 | <div id="footer-badges">\r |
| 139 | </div>\r |
| 140 | </div>\r |
| 141 | </body>\r |
| 142 | </html>\r |