| 1 | <!DOCTYPE html>\r |
| 2 | <html lang="en">\r |
| 3 | <head>\r |
| 4 | <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">\r |
| 5 | <meta name="generator" content="AsciiDoc 8.6.9">\r |
| 6 | <title>Google Summer of Code (2015)</title>\r |
| 7 | <link rel="stylesheet" href="./asciidoc.css" type="text/css">\r |
| 8 | <link rel="stylesheet" href="./pygments.css" type="text/css">\r |
| 9 | \r |
| 10 | \r |
| 11 | <script type="text/javascript" src="./asciidoc.js"></script>\r |
| 12 | <script type="text/javascript">\r |
| 13 | /*<![CDATA[*/\r |
| 14 | asciidoc.install(2);\r |
| 15 | /*]]>*/\r |
| 16 | </script>\r |
| 17 | <link rel="stylesheet" href="./mlton.css" type="text/css">\r |
| 18 | </head>\r |
| 19 | <body class="article">\r |
| 20 | <div id="banner">\r |
| 21 | <div id="banner-home">\r |
| 22 | <a href="./Home">MLton 20180207</a>\r |
| 23 | </div>\r |
| 24 | </div>\r |
| 25 | <div id="header">\r |
| 26 | <h1>Google Summer of Code (2015)</h1>\r |
| 27 | <div id="toc"> |
| 28 | <div id="toctitle">Table of Contents</div> |
| 29 | <noscript><p><b>JavaScript must be enabled in your browser to display the table of contents.</b></p></noscript> |
| 30 | </div>\r |
| 31 | </div>\r |
| 32 | <div id="content">\r |
| 33 | <div class="sect1">\r |
| 34 | <h2 id="_mentors">Mentors</h2>\r |
| 35 | <div class="sectionbody">\r |
| 36 | <div class="paragraph"><p>The following developers have agreed to serve as mentors for the 2015 Google Summer of Code:</p></div>\r |
| 37 | <div class="ulist"><ul>\r |
| 38 | <li>\r |
| 39 | <p>\r |
| 40 | <a href="http://www.cs.rit.edu/%7Emtf">Matthew Fluet</a>\r |
| 41 | </p>\r |
| 42 | </li>\r |
| 43 | <li>\r |
| 44 | <p>\r |
| 45 | <a href="http://www.cse.buffalo.edu/%7Elziarek/">Lukasz (Luke) Ziarek</a>\r |
| 46 | </p>\r |
| 47 | </li>\r |
| 48 | </ul></div>\r |
| 49 | </div>\r |
| 50 | </div>\r |
| 51 | <div class="sect1">\r |
| 52 | <h2 id="_ideas_list">Ideas List</h2>\r |
| 53 | <div class="sectionbody">\r |
| 54 | <div class="sect2">\r |
| 55 | <h3 id="_design_and_implement_a_heap_profiler">Design and Implement a Heap Profiler</h3>\r |
| 56 | <div class="paragraph"><p>A heap profile is a description of the space usage of a program. A\r |
| 57 | heap profile is concerned with the allocation, retention, and\r |
| 58 | deallocation (via garbage collection) of heap data during the\r |
| 59 | execution of a program. A heap profile can be used to diagnose\r |
| 60 | performance problems in a functional program that arise from space\r |
| 61 | leaks. This project aims to design and implement a heap profiler for\r |
| 62 | MLton compiled programs.</p></div>\r |
| 63 | <div class="paragraph"><p>Background:</p></div>\r |
| 64 | <div class="openblock">\r |
| 65 | <div class="content">\r |
| 66 | <div class="ulist"><ul>\r |
| 67 | <li>\r |
| 68 | <p>\r |
| 69 | <a href="http://portal.acm.org/citation.cfm?doid=583854.582451">GCspy: an adaptable heap visualisation framework</a>; Tony Printezis and Richard Jones\r |
| 70 | </p>\r |
| 71 | </li>\r |
| 72 | <li>\r |
| 73 | <p>\r |
| 74 | <a href="http://journals.cambridge.org/action/displayAbstract?aid=1349892">New dimensions in heap profiling</a>; Colin Runciman and Niklas Röjemo\r |
| 75 | </p>\r |
| 76 | </li>\r |
| 77 | <li>\r |
| 78 | <p>\r |
| 79 | <a href="http://www.springerlink.com/content/710501660722gw37/">Heap profiling for space efficiency</a>; Colin Runciman and Niklas Röjemo\r |
| 80 | </p>\r |
| 81 | </li>\r |
| 82 | <li>\r |
| 83 | <p>\r |
| 84 | <a href="http://journals.cambridge.org/action/displayAbstract?aid=1323096">Heap profiling of lazy functional programs</a>; Colin Runciman and David Wakeling\r |
| 85 | </p>\r |
| 86 | </li>\r |
| 87 | </ul></div>\r |
| 88 | </div></div>\r |
| 89 | <div class="paragraph"><p>Recommended Skills: C and SML programming experience; some experience with UI and visualization</p></div>\r |
| 90 | </div>\r |
| 91 | <div class="sect2">\r |
| 92 | <h3 id="_garbage_collector_improvements">Garbage Collector Improvements</h3>\r |
| 93 | <div class="paragraph"><p>The garbage collector plays a significant role in the performance of\r |
| 94 | functional languages. Garbage collect too often, and program\r |
| 95 | performance suffers due to the excessive time spent in the garbage\r |
| 96 | collector. Garbage collect not often enough, and program performance\r |
| 97 | suffers due to the excessive space used by the uncollected\r |
| 98 | garbage. One particular issue is ensuring that a program utilizing a\r |
| 99 | garbage collector "plays nice" with other processes on the system, by\r |
| 100 | not using too much or too little physical memory. While there are some\r |
| 101 | reasonable theoretical results about garbage collections with heaps of\r |
| 102 | fixed size, there seems to be insufficient work that really looks\r |
| 103 | carefully at the question of dynamically resizing the heap in response\r |
| 104 | to the live data demands of the application and, similarly, in\r |
| 105 | response to the behavior of the operating system and other\r |
| 106 | processes. This project aims to investigate improvements to the memory\r |
| 107 | behavior of MLton compiled programs through better tuning of the\r |
| 108 | garbage collector.</p></div>\r |
| 109 | <div class="paragraph"><p>Background:</p></div>\r |
| 110 | <div class="openblock">\r |
| 111 | <div class="content">\r |
| 112 | <div class="ulist"><ul>\r |
| 113 | <li>\r |
| 114 | <p>\r |
| 115 | <a href="http://gchandbook.org/">The Garbage Collection Handbook: The Art of Automatic Memory Management</a>; Richard Jones, Antony Hosking, Eliot Moss\r |
| 116 | </p>\r |
| 117 | </li>\r |
| 118 | <li>\r |
| 119 | <p>\r |
| 120 | <a href="http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.24.1020">Dual-Mode Garbage Collection</a>; Patrick Sansom\r |
| 121 | </p>\r |
| 122 | </li>\r |
| 123 | <li>\r |
| 124 | <p>\r |
| 125 | <a href="http://portal.acm.org/citation.cfm?doid=1029873.1029881">Automatic Heap Sizing: Taking Real Memory into Account</a>; Ting Yang, Matthew Hertz, Emery D. Berger, Scott F. Kaplan, and J. Eliot B. Moss\r |
| 126 | </p>\r |
| 127 | </li>\r |
| 128 | <li>\r |
| 129 | <p>\r |
| 130 | <a href="http://portal.acm.org/citation.cfm?doid=1152649.1152652">Controlling Garbage Collection and Heap Growth to Reduce the Execution Time of Java Applications</a>; Tim Brecht, Eshrat Arjomandi, Chang Li, and Hang Pham\r |
| 131 | </p>\r |
| 132 | </li>\r |
| 133 | <li>\r |
| 134 | <p>\r |
| 135 | <a href="http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=4145125">Isla Vista Heap Sizing: Using Feedback to Avoid Paging</a>; Chris Grzegorczyk, Sunil Soman, Chandra Krintz, and Rich Wolski\r |
| 136 | </p>\r |
| 137 | </li>\r |
| 138 | <li>\r |
| 139 | <p>\r |
| 140 | <a href="http://portal.acm.org/citation.cfm?doid=1806651.1806669">The Economics of Garbage Collection</a>; Jeremy Singer, Richard E. Jones, Gavin Brown, and Mikel Luján\r |
| 141 | </p>\r |
| 142 | </li>\r |
| 143 | <li>\r |
| 144 | <p>\r |
| 145 | <a href="http://www.dcs.gla.ac.uk/%7Ejsinger/pdfs/tfp12.pdf">Automated Heap Sizing in the Poly/ML Runtime (Position Paper)</a>; David White, Jeremy Singer, Jonathan Aitken, and David Matthews\r |
| 146 | </p>\r |
| 147 | </li>\r |
| 148 | <li>\r |
| 149 | <p>\r |
| 150 | <a href="http://portal.acm.org/citation.cfm?doid=2555670.2466481">Control Theory for Principled Heap Sizing</a>; David R. White, Jeremy Singer, Jonathan M. Aitken, and Richard E. Jones\r |
| 151 | </p>\r |
| 152 | </li>\r |
| 153 | </ul></div>\r |
| 154 | </div></div>\r |
| 155 | <div class="paragraph"><p>Recommended Skills: C programming experience; some operating systems and/or systems programming experience; some compiler and garbage collector experience</p></div>\r |
| 156 | </div>\r |
| 157 | <div class="sect2">\r |
| 158 | <h3 id="_heap_allocated_activation_records">Heap-allocated Activation Records</h3>\r |
| 159 | <div class="paragraph"><p>Activation records (a.k.a., stack frames) are traditionally allocated\r |
| 160 | on a stack. This naturally corresponds to the call-return pattern of\r |
| 161 | function invocation. However, there are some disadvantages to\r |
| 162 | stack-allocated activation records. In a functional programming\r |
| 163 | language, functions may be deeply recursive, resulting in call stacks\r |
| 164 | that are much larger than typically supported by the operating system;\r |
| 165 | hence, a functional programming language implementation will typically\r |
| 166 | store its stack in its heap. Furthermore, a functional programming\r |
| 167 | language implementation must handle and recover from stack overflow,\r |
| 168 | by allocating a larger stack (again, in its heap) and copying\r |
| 169 | activation records from the old stack to the new stack. In the\r |
| 170 | presence of threads, stacks must be allocated in a heap and, in the\r |
| 171 | presence of a garbage collector, should be garbage collected when\r |
| 172 | unreachable. While heap-allocated activation records avoid many of\r |
| 173 | these disadvantages, they have not been widely implemented. This\r |
| 174 | project aims to implement and evaluate heap-allocated activation\r |
| 175 | records in the MLton compiler.</p></div>\r |
| 176 | <div class="paragraph"><p>Background:</p></div>\r |
| 177 | <div class="openblock">\r |
| 178 | <div class="content">\r |
| 179 | <div class="ulist"><ul>\r |
| 180 | <li>\r |
| 181 | <p>\r |
| 182 | <a href="http://journals.cambridge.org/action/displayAbstract?aid=1295104">Empirical and Analytic Study of Stack Versus Heap Cost for Languages with Closures</a>; Andrew W. Appel and Zhong Shao\r |
| 183 | </p>\r |
| 184 | </li>\r |
| 185 | <li>\r |
| 186 | <p>\r |
| 187 | <a href="http://portal.acm.org/citation.cfm?doid=182590.156783">Space-efficient closure representations</a>; Zhong Shao and Andrew W. Appel\r |
| 188 | </p>\r |
| 189 | </li>\r |
| 190 | <li>\r |
| 191 | <p>\r |
| 192 | <a href="http://portal.acm.org/citation.cfm?doid=93548.93554">Representing control in the presence of first-class continuations</a>; R. Hieb, R. Kent Dybvig, and Carl Bruggeman\r |
| 193 | </p>\r |
| 194 | </li>\r |
| 195 | </ul></div>\r |
| 196 | </div></div>\r |
| 197 | <div class="paragraph"><p>Recommended Skills: SML programming experience; some middle- and back-end compiler experience</p></div>\r |
| 198 | </div>\r |
| 199 | <div class="sect2">\r |
| 200 | <h3 id="_correctly_rounded_floating_point_binary_to_decimal_and_decimal_to_binary_conversion_routines_in_standard_ml">Correctly Rounded Floating-point Binary-to-Decimal and Decimal-to-Binary Conversion Routines in Standard ML</h3>\r |
| 201 | <div class="paragraph"><p>The\r |
| 202 | <a href="http://en.wikipedia.org/wiki/IEEE_754-2008">IEEE Standard for Floating-Point Arithmetic (IEEE 754)</a>\r |
| 203 | is the de facto representation for floating-point computation.\r |
| 204 | However, it is a <em>binary</em> (base 2) representation of floating-point\r |
| 205 | values, while many applications call for input and output of\r |
| 206 | floating-point values in <em>decimal</em> (base 10) representation. The\r |
| 207 | <em>decimal-to-binary</em> conversion problem takes a decimal floating-point\r |
| 208 | representation (e.g., a string like <span class="monospaced">"0.1"</span>) and returns the best\r |
| 209 | binary floating-point representation of that number. The\r |
| 210 | <em>binary-to-decimal</em> conversion problem takes a binary floating-point\r |
| 211 | representation and returns a decimal floating-point representation\r |
| 212 | using the smallest number of digits that allow the decimal\r |
| 213 | floating-point representation to be converted to the original binary\r |
| 214 | floating-point representation. For both conversion routines, "best"\r |
| 215 | is dependent upon the current floating-point rounding mode.</p></div>\r |
| 216 | <div class="paragraph"><p>MLton uses David Gay’s\r |
| 217 | <a href="http://www.netlib.org/fp/gdtoa.tgz">gdtoa library</a> for floating-point\r |
| 218 | conversions. While this is an exellent library, it generalizes the\r |
| 219 | decimal-to-binary and binary-to-decimal conversion routines beyond\r |
| 220 | what is required by the\r |
| 221 | <a href="http://standardml.org/Basis/">Standard ML Basis Library</a> and induces an\r |
| 222 | external dependency on the compiler. Native implementations of these\r |
| 223 | conversion routines in Standard ML would obviate the dependency on the\r |
| 224 | <span class="monospaced">gdtoa</span> library, while also being able to take advantage of Standard\r |
| 225 | ML features in the implementation (e.g., the published algorithms\r |
| 226 | often require use of infinite precision arithmetic, which is provided\r |
| 227 | by the <span class="monospaced">IntInf</span> structure in Standard ML, but is provided in an ad hoc\r |
| 228 | fasion in the <span class="monospaced">gdtoa</span> library).</p></div>\r |
| 229 | <div class="paragraph"><p>This project aims to develop a native implementation of the conversion\r |
| 230 | routines in Standard ML.</p></div>\r |
| 231 | <div class="paragraph"><p>Background:</p></div>\r |
| 232 | <div class="openblock">\r |
| 233 | <div class="content">\r |
| 234 | <div class="ulist"><ul>\r |
| 235 | <li>\r |
| 236 | <p>\r |
| 237 | <a href="http://dl.acm.org/citation.cfm?doid=103162.103163">What every computer scientist should know about floating-point arithmetic</a>; David Goldberg\r |
| 238 | </p>\r |
| 239 | </li>\r |
| 240 | <li>\r |
| 241 | <p>\r |
| 242 | <a href="http://dl.acm.org/citation.cfm?doid=93542.93559">How to print floating-point numbers accurately</a>; Guy L. Steele, Jr. and Jon L. White\r |
| 243 | </p>\r |
| 244 | </li>\r |
| 245 | <li>\r |
| 246 | <p>\r |
| 247 | <a href="http://dl.acm.org/citation.cfm?doid=93542.93557">How to read floating point numbers accurately</a>; William D. Clinger\r |
| 248 | </p>\r |
| 249 | </li>\r |
| 250 | <li>\r |
| 251 | <p>\r |
| 252 | <a href="http://cm.bell-labs.com/cm/cs/doc/90/4-10.ps.gz">Correctly Rounded Binary-Decimal and Decimal-Binary Conversions</a>; David Gay\r |
| 253 | </p>\r |
| 254 | </li>\r |
| 255 | <li>\r |
| 256 | <p>\r |
| 257 | <a href="http://dl.acm.org/citation.cfm?doid=249069.231397">Printing floating-point numbers quickly and accurately</a>; Robert G. Burger and R. Kent Dybvig\r |
| 258 | </p>\r |
| 259 | </li>\r |
| 260 | <li>\r |
| 261 | <p>\r |
| 262 | <a href="http://dl.acm.org/citation.cfm?doid=1806596.1806623">Printing floating-point numbers quickly and accurately with integers</a>; Florian Loitsch\r |
| 263 | </p>\r |
| 264 | </li>\r |
| 265 | </ul></div>\r |
| 266 | </div></div>\r |
| 267 | <div class="paragraph"><p>Recommended Skills: SML programming experience; algorithm design and implementation</p></div>\r |
| 268 | </div>\r |
| 269 | <div class="sect2">\r |
| 270 | <h3 id="_implement_source_level_debugging">Implement Source-level Debugging</h3>\r |
| 271 | <div class="paragraph"><p>Debugging is a fact of programming life. Unfortunately, most SML\r |
| 272 | implementations (including MLton) provide little to no source-level\r |
| 273 | debugging support. This project aims to add basic to intermediate\r |
| 274 | source-level debugging support to the MLton compiler. MLton already\r |
| 275 | supports source-level profiling, which can be used to attribute bytes\r |
| 276 | allocated or time spent in source functions. It should be relatively\r |
| 277 | straightforward to leverage this source-level information into basic\r |
| 278 | source-level debugging support, with the ability to set/unset\r |
| 279 | breakpoints and step through declarations and functions. It may be\r |
| 280 | possible to also provide intermediate source-level debugging support,\r |
| 281 | with the ability to inspect in-scope variables of basic types (e.g.,\r |
| 282 | types compatible with MLton’s foreign function interface).</p></div>\r |
| 283 | <div class="paragraph"><p>Background:</p></div>\r |
| 284 | <div class="openblock">\r |
| 285 | <div class="content">\r |
| 286 | <div class="ulist"><ul>\r |
| 287 | <li>\r |
| 288 | <p>\r |
| 289 | <a href="http://mlton.org/HowProfilingWorks">MLton — How Profiling Works</a>\r |
| 290 | </p>\r |
| 291 | </li>\r |
| 292 | <li>\r |
| 293 | <p>\r |
| 294 | <a href="http://mlton.org/ForeignFunctionInterfaceTypes">MLton — Foreign Function Interface Types</a>\r |
| 295 | </p>\r |
| 296 | </li>\r |
| 297 | <li>\r |
| 298 | <p>\r |
| 299 | <a href="http://dwarfstd.org/">DWARF Debugging Standard</a>\r |
| 300 | </p>\r |
| 301 | </li>\r |
| 302 | <li>\r |
| 303 | <p>\r |
| 304 | <a href="http://sourceware.org/gdb/current/onlinedocs/stabs/index.html">STABS Debugging Format</a>\r |
| 305 | </p>\r |
| 306 | </li>\r |
| 307 | </ul></div>\r |
| 308 | </div></div>\r |
| 309 | <div class="paragraph"><p>Recommended Skills: SML programming experience; some compiler experience</p></div>\r |
| 310 | </div>\r |
| 311 | <div class="sect2">\r |
| 312 | <h3 id="_region_based_memory_management">Region Based Memory Management</h3>\r |
| 313 | <div class="paragraph"><p>Region based memory management is an alternative automatic memory\r |
| 314 | management scheme to garbage collection. Regions can be inferred by\r |
| 315 | the compiler (e.g., Cyclone and MLKit) or provided to the programmer\r |
| 316 | through a library. Since many students do not have extensive\r |
| 317 | experience with compilers we plan on adopting the later approach.\r |
| 318 | Creating a viable region based memory solution requires the removal of\r |
| 319 | the GC and changes to the allocator. Additionally, write barriers\r |
| 320 | will be necessary to ensure references between two ML objects is never\r |
| 321 | established if the left hand side of the assignment has a longer\r |
| 322 | lifetime than the right hand side. Students will need to come up with\r |
| 323 | an appropriate interface for creating, entering, and exiting regions\r |
| 324 | (examples include RTSJ scoped memory and SCJ scoped memory).</p></div>\r |
| 325 | <div class="paragraph"><p>Background:</p></div>\r |
| 326 | <div class="openblock">\r |
| 327 | <div class="content">\r |
| 328 | <div class="ulist"><ul>\r |
| 329 | <li>\r |
| 330 | <p>\r |
| 331 | Cyclone\r |
| 332 | </p>\r |
| 333 | </li>\r |
| 334 | <li>\r |
| 335 | <p>\r |
| 336 | MLKit\r |
| 337 | </p>\r |
| 338 | </li>\r |
| 339 | <li>\r |
| 340 | <p>\r |
| 341 | RTSJ + SCJ scopes\r |
| 342 | </p>\r |
| 343 | </li>\r |
| 344 | </ul></div>\r |
| 345 | </div></div>\r |
| 346 | <div class="paragraph"><p>Recommended Skills: SML programming experience; C programming experience; some compiler and garbage collector experience</p></div>\r |
| 347 | </div>\r |
| 348 | <div class="sect2">\r |
| 349 | <h3 id="_adding_real_time_capabilities">Adding Real-Time Capabilities</h3>\r |
| 350 | <div class="paragraph"><p>This project focuses on exposing real-time APIs from a real-time OS\r |
| 351 | kernel at the SML level. This will require mapping the current MLton\r |
| 352 | (or <a href="http://multimlton.cs.purdue.edu">MultiMLton</a>) threading framework\r |
| 353 | to real-time threads that the RTOS provides. This will include\r |
| 354 | associating priorities with MLton threads and building priority based\r |
| 355 | scheduling algorithms. Additionally, support for perdioc, aperiodic,\r |
| 356 | and sporadic tasks should be supported. A real-time SML library will\r |
| 357 | need to be created to provide a forward facing interface for\r |
| 358 | programmers. Stretch goals include reworking the MLton <span class="monospaced">atomic</span>\r |
| 359 | statement and associated synchronization primitives built on top of\r |
| 360 | the MLton <span class="monospaced">atomic</span> statement.</p></div>\r |
| 361 | <div class="paragraph"><p>Recommended Skills: SML programming experience; C programming experience; real-time experience a plus but not required</p></div>\r |
| 362 | </div>\r |
| 363 | <div class="sect2">\r |
| 364 | <h3 id="_real_time_garbage_collection">Real-Time Garbage Collection</h3>\r |
| 365 | <div class="paragraph"><p>This project focuses on modifications to the MLton GC to support\r |
| 366 | real-time garbage collection. We will model the real-time GC on the\r |
| 367 | Schism RTGC. The first task will be to create a fixed size runtime\r |
| 368 | object representation. Large structures will need to be represented\r |
| 369 | as a linked lists of fixed sized objects. Arrays and vectors will be\r |
| 370 | transferred into dense trees. Compaction and copying can therefore be\r |
| 371 | removed from the GC algorithms that MLton currently supports. Lastly,\r |
| 372 | the GC will be made concurrent, allowing for the execution of the GC\r |
| 373 | threads as the lowest priority task in the system. Stretch goals\r |
| 374 | include a priority aware mechanism for the GC to signal to real-time\r |
| 375 | ML threads that it needs to scan their stack and identification of\r |
| 376 | places where the stack is shallow to bound priority inversion during\r |
| 377 | this procedure.</p></div>\r |
| 378 | <div class="paragraph"><p>Recommended Skills: C programming experience; garbage collector experience a plus but not required</p></div>\r |
| 379 | </div>\r |
| 380 | </div>\r |
| 381 | </div>\r |
| 382 | </div>\r |
| 383 | <div id="footnotes"><hr></div>\r |
| 384 | <div id="footer">\r |
| 385 | <div id="footer-text">\r |
| 386 | </div>\r |
| 387 | <div id="footer-badges">\r |
| 388 | </div>\r |
| 389 | </div>\r |
| 390 | </body>\r |
| 391 | </html>\r |