Backport from sid to buster
[hcoop/debian/mlton.git] / doc / guide / localhost / GoogleSummerOfCode2015
CommitLineData
7f918cf1
CE
1<!DOCTYPE html>\r
2<html lang="en">\r
3<head>\r
4<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">\r
5<meta name="generator" content="AsciiDoc 8.6.9">\r
6<title>Google Summer of Code (2015)</title>\r
7<link rel="stylesheet" href="./asciidoc.css" type="text/css">\r
8<link rel="stylesheet" href="./pygments.css" type="text/css">\r
9\r
10\r
11<script type="text/javascript" src="./asciidoc.js"></script>\r
12<script type="text/javascript">\r
13/*<![CDATA[*/\r
14asciidoc.install(2);\r
15/*]]>*/\r
16</script>\r
17<link rel="stylesheet" href="./mlton.css" type="text/css">\r
18</head>\r
19<body class="article">\r
20<div id="banner">\r
21<div id="banner-home">\r
22<a href="./Home">MLton 20180207</a>\r
23</div>\r
24</div>\r
25<div id="header">\r
26<h1>Google Summer of Code (2015)</h1>\r
27<div id="toc">
28 <div id="toctitle">Table of Contents</div>
29 <noscript><p><b>JavaScript must be enabled in your browser to display the table of contents.</b></p></noscript>
30</div>\r
31</div>\r
32<div id="content">\r
33<div class="sect1">\r
34<h2 id="_mentors">Mentors</h2>\r
35<div class="sectionbody">\r
36<div class="paragraph"><p>The following developers have agreed to serve as mentors for the 2015 Google Summer of Code:</p></div>\r
37<div class="ulist"><ul>\r
38<li>\r
39<p>\r
40<a href="http://www.cs.rit.edu/%7Emtf">Matthew Fluet</a>\r
41</p>\r
42</li>\r
43<li>\r
44<p>\r
45<a href="http://www.cse.buffalo.edu/%7Elziarek/">Lukasz (Luke) Ziarek</a>\r
46</p>\r
47</li>\r
48</ul></div>\r
49</div>\r
50</div>\r
51<div class="sect1">\r
52<h2 id="_ideas_list">Ideas List</h2>\r
53<div class="sectionbody">\r
54<div class="sect2">\r
55<h3 id="_design_and_implement_a_heap_profiler">Design and Implement a Heap Profiler</h3>\r
56<div class="paragraph"><p>A heap profile is a description of the space usage of a program. A\r
57heap profile is concerned with the allocation, retention, and\r
58deallocation (via garbage collection) of heap data during the\r
59execution of a program. A heap profile can be used to diagnose\r
60performance problems in a functional program that arise from space\r
61leaks. This project aims to design and implement a heap profiler for\r
62MLton compiled programs.</p></div>\r
63<div class="paragraph"><p>Background:</p></div>\r
64<div class="openblock">\r
65<div class="content">\r
66<div class="ulist"><ul>\r
67<li>\r
68<p>\r
69<a href="http://portal.acm.org/citation.cfm?doid=583854.582451">GCspy: an adaptable heap visualisation framework</a>; Tony Printezis and Richard Jones\r
70</p>\r
71</li>\r
72<li>\r
73<p>\r
74<a href="http://journals.cambridge.org/action/displayAbstract?aid=1349892">New dimensions in heap profiling</a>; Colin Runciman and Niklas R&ouml;jemo\r
75</p>\r
76</li>\r
77<li>\r
78<p>\r
79<a href="http://www.springerlink.com/content/710501660722gw37/">Heap profiling for space efficiency</a>; Colin Runciman and Niklas R&ouml;jemo\r
80</p>\r
81</li>\r
82<li>\r
83<p>\r
84<a href="http://journals.cambridge.org/action/displayAbstract?aid=1323096">Heap profiling of lazy functional programs</a>; Colin Runciman and David Wakeling\r
85</p>\r
86</li>\r
87</ul></div>\r
88</div></div>\r
89<div class="paragraph"><p>Recommended Skills: C and SML programming experience; some experience with UI and visualization</p></div>\r
90</div>\r
91<div class="sect2">\r
92<h3 id="_garbage_collector_improvements">Garbage Collector Improvements</h3>\r
93<div class="paragraph"><p>The garbage collector plays a significant role in the performance of\r
94functional languages. Garbage collect too often, and program\r
95performance suffers due to the excessive time spent in the garbage\r
96collector. Garbage collect not often enough, and program performance\r
97suffers due to the excessive space used by the uncollected\r
98garbage. One particular issue is ensuring that a program utilizing a\r
99garbage collector "plays nice" with other processes on the system, by\r
100not using too much or too little physical memory. While there are some\r
101reasonable theoretical results about garbage collections with heaps of\r
102fixed size, there seems to be insufficient work that really looks\r
103carefully at the question of dynamically resizing the heap in response\r
104to the live data demands of the application and, similarly, in\r
105response to the behavior of the operating system and other\r
106processes. This project aims to investigate improvements to the memory\r
107behavior of MLton compiled programs through better tuning of the\r
108garbage collector.</p></div>\r
109<div class="paragraph"><p>Background:</p></div>\r
110<div class="openblock">\r
111<div class="content">\r
112<div class="ulist"><ul>\r
113<li>\r
114<p>\r
115<a href="http://gchandbook.org/">The Garbage Collection Handbook: The Art of Automatic Memory Management</a>; Richard Jones, Antony Hosking, Eliot Moss\r
116</p>\r
117</li>\r
118<li>\r
119<p>\r
120<a href="http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.24.1020">Dual-Mode Garbage Collection</a>; Patrick Sansom\r
121</p>\r
122</li>\r
123<li>\r
124<p>\r
125<a href="http://portal.acm.org/citation.cfm?doid=1029873.1029881">Automatic Heap Sizing: Taking Real Memory into Account</a>; Ting Yang, Matthew Hertz, Emery D. Berger, Scott F. Kaplan, and J. Eliot B. Moss\r
126</p>\r
127</li>\r
128<li>\r
129<p>\r
130<a href="http://portal.acm.org/citation.cfm?doid=1152649.1152652">Controlling Garbage Collection and Heap Growth to Reduce the Execution Time of Java Applications</a>; Tim Brecht, Eshrat Arjomandi, Chang Li, and Hang Pham\r
131</p>\r
132</li>\r
133<li>\r
134<p>\r
135<a href="http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=4145125">Isla Vista Heap Sizing: Using Feedback to Avoid Paging</a>; Chris Grzegorczyk, Sunil Soman, Chandra Krintz, and Rich Wolski\r
136</p>\r
137</li>\r
138<li>\r
139<p>\r
140<a href="http://portal.acm.org/citation.cfm?doid=1806651.1806669">The Economics of Garbage Collection</a>; Jeremy Singer, Richard E. Jones, Gavin Brown, and Mikel Luján\r
141</p>\r
142</li>\r
143<li>\r
144<p>\r
145<a href="http://www.dcs.gla.ac.uk/%7Ejsinger/pdfs/tfp12.pdf">Automated Heap Sizing in the Poly/ML Runtime (Position Paper)</a>; David White, Jeremy Singer, Jonathan Aitken, and David Matthews\r
146</p>\r
147</li>\r
148<li>\r
149<p>\r
150<a href="http://portal.acm.org/citation.cfm?doid=2555670.2466481">Control Theory for Principled Heap Sizing</a>; David R. White, Jeremy Singer, Jonathan M. Aitken, and Richard E. Jones\r
151</p>\r
152</li>\r
153</ul></div>\r
154</div></div>\r
155<div class="paragraph"><p>Recommended Skills: C programming experience; some operating systems and/or systems programming experience; some compiler and garbage collector experience</p></div>\r
156</div>\r
157<div class="sect2">\r
158<h3 id="_heap_allocated_activation_records">Heap-allocated Activation Records</h3>\r
159<div class="paragraph"><p>Activation records (a.k.a., stack frames) are traditionally allocated\r
160on a stack. This naturally corresponds to the call-return pattern of\r
161function invocation. However, there are some disadvantages to\r
162stack-allocated activation records. In a functional programming\r
163language, functions may be deeply recursive, resulting in call stacks\r
164that are much larger than typically supported by the operating system;\r
165hence, a functional programming language implementation will typically\r
166store its stack in its heap. Furthermore, a functional programming\r
167language implementation must handle and recover from stack overflow,\r
168by allocating a larger stack (again, in its heap) and copying\r
169activation records from the old stack to the new stack. In the\r
170presence of threads, stacks must be allocated in a heap and, in the\r
171presence of a garbage collector, should be garbage collected when\r
172unreachable. While heap-allocated activation records avoid many of\r
173these disadvantages, they have not been widely implemented. This\r
174project aims to implement and evaluate heap-allocated activation\r
175records in the MLton compiler.</p></div>\r
176<div class="paragraph"><p>Background:</p></div>\r
177<div class="openblock">\r
178<div class="content">\r
179<div class="ulist"><ul>\r
180<li>\r
181<p>\r
182<a href="http://journals.cambridge.org/action/displayAbstract?aid=1295104">Empirical and Analytic Study of Stack Versus Heap Cost for Languages with Closures</a>; Andrew W. Appel and Zhong Shao\r
183</p>\r
184</li>\r
185<li>\r
186<p>\r
187<a href="http://portal.acm.org/citation.cfm?doid=182590.156783">Space-efficient closure representations</a>; Zhong Shao and Andrew W. Appel\r
188</p>\r
189</li>\r
190<li>\r
191<p>\r
192<a href="http://portal.acm.org/citation.cfm?doid=93548.93554">Representing control in the presence of first-class continuations</a>; R. Hieb, R. Kent Dybvig, and Carl Bruggeman\r
193</p>\r
194</li>\r
195</ul></div>\r
196</div></div>\r
197<div class="paragraph"><p>Recommended Skills: SML programming experience; some middle- and back-end compiler experience</p></div>\r
198</div>\r
199<div class="sect2">\r
200<h3 id="_correctly_rounded_floating_point_binary_to_decimal_and_decimal_to_binary_conversion_routines_in_standard_ml">Correctly Rounded Floating-point Binary-to-Decimal and Decimal-to-Binary Conversion Routines in Standard ML</h3>\r
201<div class="paragraph"><p>The\r
202<a href="http://en.wikipedia.org/wiki/IEEE_754-2008">IEEE Standard for Floating-Point Arithmetic (IEEE 754)</a>\r
203is the de facto representation for floating-point computation.\r
204However, it is a <em>binary</em> (base 2) representation of floating-point\r
205values, while many applications call for input and output of\r
206floating-point values in <em>decimal</em> (base 10) representation. The\r
207<em>decimal-to-binary</em> conversion problem takes a decimal floating-point\r
208representation (e.g., a string like <span class="monospaced">"0.1"</span>) and returns the best\r
209binary floating-point representation of that number. The\r
210<em>binary-to-decimal</em> conversion problem takes a binary floating-point\r
211representation and returns a decimal floating-point representation\r
212using the smallest number of digits that allow the decimal\r
213floating-point representation to be converted to the original binary\r
214floating-point representation. For both conversion routines, "best"\r
215is dependent upon the current floating-point rounding mode.</p></div>\r
216<div class="paragraph"><p>MLton uses David Gay&#8217;s\r
217<a href="http://www.netlib.org/fp/gdtoa.tgz">gdtoa library</a> for floating-point\r
218conversions. While this is an exellent library, it generalizes the\r
219decimal-to-binary and binary-to-decimal conversion routines beyond\r
220what is required by the\r
221<a href="http://standardml.org/Basis/">Standard ML Basis Library</a> and induces an\r
222external dependency on the compiler. Native implementations of these\r
223conversion routines in Standard ML would obviate the dependency on the\r
224<span class="monospaced">gdtoa</span> library, while also being able to take advantage of Standard\r
225ML features in the implementation (e.g., the published algorithms\r
226often require use of infinite precision arithmetic, which is provided\r
227by the <span class="monospaced">IntInf</span> structure in Standard ML, but is provided in an ad hoc\r
228fasion in the <span class="monospaced">gdtoa</span> library).</p></div>\r
229<div class="paragraph"><p>This project aims to develop a native implementation of the conversion\r
230routines in Standard ML.</p></div>\r
231<div class="paragraph"><p>Background:</p></div>\r
232<div class="openblock">\r
233<div class="content">\r
234<div class="ulist"><ul>\r
235<li>\r
236<p>\r
237<a href="http://dl.acm.org/citation.cfm?doid=103162.103163">What every computer scientist should know about floating-point arithmetic</a>; David Goldberg\r
238</p>\r
239</li>\r
240<li>\r
241<p>\r
242<a href="http://dl.acm.org/citation.cfm?doid=93542.93559">How to print floating-point numbers accurately</a>; Guy L. Steele, Jr. and Jon L. White\r
243</p>\r
244</li>\r
245<li>\r
246<p>\r
247<a href="http://dl.acm.org/citation.cfm?doid=93542.93557">How to read floating point numbers accurately</a>; William D. Clinger\r
248</p>\r
249</li>\r
250<li>\r
251<p>\r
252<a href="http://cm.bell-labs.com/cm/cs/doc/90/4-10.ps.gz">Correctly Rounded Binary-Decimal and Decimal-Binary Conversions</a>; David Gay\r
253</p>\r
254</li>\r
255<li>\r
256<p>\r
257<a href="http://dl.acm.org/citation.cfm?doid=249069.231397">Printing floating-point numbers quickly and accurately</a>; Robert G. Burger and R. Kent Dybvig\r
258</p>\r
259</li>\r
260<li>\r
261<p>\r
262<a href="http://dl.acm.org/citation.cfm?doid=1806596.1806623">Printing floating-point numbers quickly and accurately with integers</a>; Florian Loitsch\r
263</p>\r
264</li>\r
265</ul></div>\r
266</div></div>\r
267<div class="paragraph"><p>Recommended Skills: SML programming experience; algorithm design and implementation</p></div>\r
268</div>\r
269<div class="sect2">\r
270<h3 id="_implement_source_level_debugging">Implement Source-level Debugging</h3>\r
271<div class="paragraph"><p>Debugging is a fact of programming life. Unfortunately, most SML\r
272implementations (including MLton) provide little to no source-level\r
273debugging support. This project aims to add basic to intermediate\r
274source-level debugging support to the MLton compiler. MLton already\r
275supports source-level profiling, which can be used to attribute bytes\r
276allocated or time spent in source functions. It should be relatively\r
277straightforward to leverage this source-level information into basic\r
278source-level debugging support, with the ability to set/unset\r
279breakpoints and step through declarations and functions. It may be\r
280possible to also provide intermediate source-level debugging support,\r
281with the ability to inspect in-scope variables of basic types (e.g.,\r
282types compatible with MLton&#8217;s foreign function interface).</p></div>\r
283<div class="paragraph"><p>Background:</p></div>\r
284<div class="openblock">\r
285<div class="content">\r
286<div class="ulist"><ul>\r
287<li>\r
288<p>\r
289<a href="http://mlton.org/HowProfilingWorks">MLton&#8201;&#8212;&#8201;How Profiling Works</a>\r
290</p>\r
291</li>\r
292<li>\r
293<p>\r
294<a href="http://mlton.org/ForeignFunctionInterfaceTypes">MLton&#8201;&#8212;&#8201;Foreign Function Interface Types</a>\r
295</p>\r
296</li>\r
297<li>\r
298<p>\r
299<a href="http://dwarfstd.org/">DWARF Debugging Standard</a>\r
300</p>\r
301</li>\r
302<li>\r
303<p>\r
304<a href="http://sourceware.org/gdb/current/onlinedocs/stabs/index.html">STABS Debugging Format</a>\r
305</p>\r
306</li>\r
307</ul></div>\r
308</div></div>\r
309<div class="paragraph"><p>Recommended Skills: SML programming experience; some compiler experience</p></div>\r
310</div>\r
311<div class="sect2">\r
312<h3 id="_region_based_memory_management">Region Based Memory Management</h3>\r
313<div class="paragraph"><p>Region based memory management is an alternative automatic memory\r
314management scheme to garbage collection. Regions can be inferred by\r
315the compiler (e.g., Cyclone and MLKit) or provided to the programmer\r
316through a library. Since many students do not have extensive\r
317experience with compilers we plan on adopting the later approach.\r
318Creating a viable region based memory solution requires the removal of\r
319the GC and changes to the allocator. Additionally, write barriers\r
320will be necessary to ensure references between two ML objects is never\r
321established if the left hand side of the assignment has a longer\r
322lifetime than the right hand side. Students will need to come up with\r
323an appropriate interface for creating, entering, and exiting regions\r
324(examples include RTSJ scoped memory and SCJ scoped memory).</p></div>\r
325<div class="paragraph"><p>Background:</p></div>\r
326<div class="openblock">\r
327<div class="content">\r
328<div class="ulist"><ul>\r
329<li>\r
330<p>\r
331Cyclone\r
332</p>\r
333</li>\r
334<li>\r
335<p>\r
336MLKit\r
337</p>\r
338</li>\r
339<li>\r
340<p>\r
341RTSJ + SCJ scopes\r
342</p>\r
343</li>\r
344</ul></div>\r
345</div></div>\r
346<div class="paragraph"><p>Recommended Skills: SML programming experience; C programming experience; some compiler and garbage collector experience</p></div>\r
347</div>\r
348<div class="sect2">\r
349<h3 id="_adding_real_time_capabilities">Adding Real-Time Capabilities</h3>\r
350<div class="paragraph"><p>This project focuses on exposing real-time APIs from a real-time OS\r
351kernel at the SML level. This will require mapping the current MLton\r
352(or <a href="http://multimlton.cs.purdue.edu">MultiMLton</a>) threading framework\r
353to real-time threads that the RTOS provides. This will include\r
354associating priorities with MLton threads and building priority based\r
355scheduling algorithms. Additionally, support for perdioc, aperiodic,\r
356and sporadic tasks should be supported. A real-time SML library will\r
357need to be created to provide a forward facing interface for\r
358programmers. Stretch goals include reworking the MLton <span class="monospaced">atomic</span>\r
359statement and associated synchronization primitives built on top of\r
360the MLton <span class="monospaced">atomic</span> statement.</p></div>\r
361<div class="paragraph"><p>Recommended Skills: SML programming experience; C programming experience; real-time experience a plus but not required</p></div>\r
362</div>\r
363<div class="sect2">\r
364<h3 id="_real_time_garbage_collection">Real-Time Garbage Collection</h3>\r
365<div class="paragraph"><p>This project focuses on modifications to the MLton GC to support\r
366real-time garbage collection. We will model the real-time GC on the\r
367Schism RTGC. The first task will be to create a fixed size runtime\r
368object representation. Large structures will need to be represented\r
369as a linked lists of fixed sized objects. Arrays and vectors will be\r
370transferred into dense trees. Compaction and copying can therefore be\r
371removed from the GC algorithms that MLton currently supports. Lastly,\r
372the GC will be made concurrent, allowing for the execution of the GC\r
373threads as the lowest priority task in the system. Stretch goals\r
374include a priority aware mechanism for the GC to signal to real-time\r
375ML threads that it needs to scan their stack and identification of\r
376places where the stack is shallow to bound priority inversion during\r
377this procedure.</p></div>\r
378<div class="paragraph"><p>Recommended Skills: C programming experience; garbage collector experience a plus but not required</p></div>\r
379</div>\r
380</div>\r
381</div>\r
382</div>\r
383<div id="footnotes"><hr></div>\r
384<div id="footer">\r
385<div id="footer-text">\r
386</div>\r
387<div id="footer-badges">\r
388</div>\r
389</div>\r
390</body>\r
391</html>\r