1 Google Summer of Code (2015)
2 ============================
7 The following developers have agreed to serve as mentors for the 2015 Google Summer of Code:
9 * http://www.cs.rit.edu/%7Emtf[Matthew Fluet]
10 * http://www.cse.buffalo.edu/%7Elziarek/[Lukasz (Luke) Ziarek]
12 * http://people.cs.uchicago.edu/~jhr/[John Reppy]
13 * http://www.cs.purdue.edu/homes/chandras[KC Sivaramakrishnan]
14 * http://www.cs.purdue.edu/homes/suresh/[Suresh Jagannathan]
20 === Implement a Partial Redundancy Elimination (PRE) Optimization ===
22 Partial redundancy elimination (PRE) is a program transformation that
23 removes operations that are redundant on some, but not necessarily all
24 paths, through the program. PRE can subsume both common subexpression
25 elimination and loop-invariant code motion, and is therefore a
26 potentially powerful optimization. However, a naïve implementation of
27 PRE on a program in static single assignment (SSA) form is unlikely to
28 be effective. This project aims to adapt and implement the GVN-PRE
29 algorithm of Thomas VanDrunen in MLton's SSA intermediate language.
33 * http://cs.wheaton.edu/%7Etvandrun/writings/thesis.pdf[Partial Redundancy Elimination for Global Value Numbering]; Thomas VanDrunen
34 * http://www.cs.purdue.edu/research/technical_reports/2003/TR%2003-032.pdf[Corner-cases in Value-Based Partial Redundancy Elimination]; Thomas VanDrunen and Antony L. Hosking
35 * http://www.springerlink.com/content/w06m3cw453nphm1u/[Value-Based Partial Redundancy Elimination]; Thomas VanDrunen and Antony L. Hosking
36 * http://onlinelibrary.wiley.com/doi/10.1002/spe.618/abstract[Anticipation-based Partial Redundancy Elimination for Static Single Assignment Form]; Thomas VanDrunen and Antony L. Hosking
37 * http://portal.acm.org/citation.cfm?doid=319301.319348[Partial Redundancy Elimination in SSA Form]; Robert Kennedy, Sun Chan, Shin-Ming Liu, Raymond Lo, Peng Tu, and Fred Chow
40 Recommended Skills: SML programming experience; some middle-end compiler experience
42 Mentor: http://www.cs.rit.edu/%7Emtf[Matthew Fluet]
45 === Design and Implement a Heap Profiler ===
47 A heap profile is a description of the space usage of a program. A
48 heap profile is concerned with the allocation, retention, and
49 deallocation (via garbage collection) of heap data during the
50 execution of a program. A heap profile can be used to diagnose
51 performance problems in a functional program that arise from space
52 leaks. This project aims to design and implement a heap profiler for
53 MLton compiled programs.
57 * http://portal.acm.org/citation.cfm?doid=583854.582451[GCspy: an adaptable heap visualisation framework]; Tony Printezis and Richard Jones
58 * http://journals.cambridge.org/action/displayAbstract?aid=1349892[New dimensions in heap profiling]; Colin Runciman and Niklas Röjemo
59 * http://www.springerlink.com/content/710501660722gw37/[Heap profiling for space efficiency]; Colin Runciman and Niklas Röjemo
60 * http://journals.cambridge.org/action/displayAbstract?aid=1323096[Heap profiling of lazy functional programs]; Colin Runciman and David Wakeling
63 Recommended Skills: C and SML programming experience; some experience with UI and visualization
66 Mentor: http://www.cs.rit.edu/%7Emtf[Matthew Fluet]
69 === Garbage Collector Improvements ===
71 The garbage collector plays a significant role in the performance of
72 functional languages. Garbage collect too often, and program
73 performance suffers due to the excessive time spent in the garbage
74 collector. Garbage collect not often enough, and program performance
75 suffers due to the excessive space used by the uncollected
76 garbage. One particular issue is ensuring that a program utilizing a
77 garbage collector "plays nice" with other processes on the system, by
78 not using too much or too little physical memory. While there are some
79 reasonable theoretical results about garbage collections with heaps of
80 fixed size, there seems to be insufficient work that really looks
81 carefully at the question of dynamically resizing the heap in response
82 to the live data demands of the application and, similarly, in
83 response to the behavior of the operating system and other
84 processes. This project aims to investigate improvements to the memory
85 behavior of MLton compiled programs through better tuning of the
90 * http://gchandbook.org/[The Garbage Collection Handbook: The Art of Automatic Memory Management]; Richard Jones, Antony Hosking, Eliot Moss
91 * http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.24.1020[Dual-Mode Garbage Collection]; Patrick Sansom
92 * http://portal.acm.org/citation.cfm?doid=1029873.1029881[Automatic Heap Sizing: Taking Real Memory into Account]; Ting Yang, Matthew Hertz, Emery D. Berger, Scott F. Kaplan, and J. Eliot B. Moss
93 * http://portal.acm.org/citation.cfm?doid=1152649.1152652[Controlling Garbage Collection and Heap Growth to Reduce the Execution Time of Java Applications]; Tim Brecht, Eshrat Arjomandi, Chang Li, and Hang Pham
94 * http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=4145125[Isla Vista Heap Sizing: Using Feedback to Avoid Paging]; Chris Grzegorczyk, Sunil Soman, Chandra Krintz, and Rich Wolski
95 * http://portal.acm.org/citation.cfm?doid=1806651.1806669[The Economics of Garbage Collection]; Jeremy Singer, Richard E. Jones, Gavin Brown, and Mikel Luján
96 * http://www.dcs.gla.ac.uk/%7Ejsinger/pdfs/tfp12.pdf[Automated Heap Sizing in the Poly/ML Runtime (Position Paper)]; David White, Jeremy Singer, Jonathan Aitken, and David Matthews
97 * http://portal.acm.org/citation.cfm?doid=2555670.2466481[Control Theory for Principled Heap Sizing]; David R. White, Jeremy Singer, Jonathan M. Aitken, and Richard E. Jones
100 Recommended Skills: C programming experience; some operating systems and/or systems programming experience; some compiler and garbage collector experience
103 Mentor: http://www.cs.rit.edu/%7Emtf[Matthew Fluet]
106 === Heap-allocated Activation Records ===
108 Activation records (a.k.a., stack frames) are traditionally allocated
109 on a stack. This naturally corresponds to the call-return pattern of
110 function invocation. However, there are some disadvantages to
111 stack-allocated activation records. In a functional programming
112 language, functions may be deeply recursive, resulting in call stacks
113 that are much larger than typically supported by the operating system;
114 hence, a functional programming language implementation will typically
115 store its stack in its heap. Furthermore, a functional programming
116 language implementation must handle and recover from stack overflow,
117 by allocating a larger stack (again, in its heap) and copying
118 activation records from the old stack to the new stack. In the
119 presence of threads, stacks must be allocated in a heap and, in the
120 presence of a garbage collector, should be garbage collected when
121 unreachable. While heap-allocated activation records avoid many of
122 these disadvantages, they have not been widely implemented. This
123 project aims to implement and evaluate heap-allocated activation
124 records in the MLton compiler.
128 * http://journals.cambridge.org/action/displayAbstract?aid=1295104[Empirical and Analytic Study of Stack Versus Heap Cost for Languages with Closures]; Andrew W. Appel and Zhong Shao
129 * http://portal.acm.org/citation.cfm?doid=182590.156783[Space-efficient closure representations]; Zhong Shao and Andrew W. Appel
130 * http://portal.acm.org/citation.cfm?doid=93548.93554[Representing control in the presence of first-class continuations]; R. Hieb, R. Kent Dybvig, and Carl Bruggeman
133 Recommended Skills: SML programming experience; some middle- and back-end compiler experience
136 Mentor: http://www.cs.rit.edu/%7Emtf[Matthew Fluet]
139 === Correctly Rounded Floating-point Binary-to-Decimal and Decimal-to-Binary Conversion Routines in Standard ML ===
142 http://en.wikipedia.org/wiki/IEEE_754-2008[IEEE Standard for Floating-Point Arithmetic (IEEE 754)]
143 is the de facto representation for floating-point computation.
144 However, it is a _binary_ (base 2) representation of floating-point
145 values, while many applications call for input and output of
146 floating-point values in _decimal_ (base 10) representation. The
147 _decimal-to-binary_ conversion problem takes a decimal floating-point
148 representation (e.g., a string like +"0.1"+) and returns the best
149 binary floating-point representation of that number. The
150 _binary-to-decimal_ conversion problem takes a binary floating-point
151 representation and returns a decimal floating-point representation
152 using the smallest number of digits that allow the decimal
153 floating-point representation to be converted to the original binary
154 floating-point representation. For both conversion routines, "best"
155 is dependent upon the current floating-point rounding mode.
157 MLton uses David Gay's
158 http://www.netlib.org/fp/gdtoa.tgz[gdtoa library] for floating-point
159 conversions. While this is an exellent library, it generalizes the
160 decimal-to-binary and binary-to-decimal conversion routines beyond
161 what is required by the
162 http://standardml.org/Basis/[Standard ML Basis Library] and induces an
163 external dependency on the compiler. Native implementations of these
164 conversion routines in Standard ML would obviate the dependency on the
165 +gdtoa+ library, while also being able to take advantage of Standard
166 ML features in the implementation (e.g., the published algorithms
167 often require use of infinite precision arithmetic, which is provided
168 by the +IntInf+ structure in Standard ML, but is provided in an ad hoc
169 fasion in the +gdtoa+ library).
171 This project aims to develop a native implementation of the conversion
172 routines in Standard ML.
176 * http://dl.acm.org/citation.cfm?doid=103162.103163[What every computer scientist should know about floating-point arithmetic]; David Goldberg
177 * http://dl.acm.org/citation.cfm?doid=93542.93559[How to print floating-point numbers accurately]; Guy L. Steele, Jr. and Jon L. White
178 * http://dl.acm.org/citation.cfm?doid=93542.93557[How to read floating point numbers accurately]; William D. Clinger
179 * http://cm.bell-labs.com/cm/cs/doc/90/4-10.ps.gz[Correctly Rounded Binary-Decimal and Decimal-Binary Conversions]; David Gay
180 * http://dl.acm.org/citation.cfm?doid=249069.231397[Printing floating-point numbers quickly and accurately]; Robert G. Burger and R. Kent Dybvig
181 * http://dl.acm.org/citation.cfm?doid=1806596.1806623[Printing floating-point numbers quickly and accurately with integers]; Florian Loitsch
184 Recommended Skills: SML programming experience; algorithm design and implementation
187 Mentor: http://www.cs.rit.edu/%7Emtf[Matthew Fluet]
190 === Implement Source-level Debugging ===
192 Debugging is a fact of programming life. Unfortunately, most SML
193 implementations (including MLton) provide little to no source-level
194 debugging support. This project aims to add basic to intermediate
195 source-level debugging support to the MLton compiler. MLton already
196 supports source-level profiling, which can be used to attribute bytes
197 allocated or time spent in source functions. It should be relatively
198 straightforward to leverage this source-level information into basic
199 source-level debugging support, with the ability to set/unset
200 breakpoints and step through declarations and functions. It may be
201 possible to also provide intermediate source-level debugging support,
202 with the ability to inspect in-scope variables of basic types (e.g.,
203 types compatible with MLton's foreign function interface).
207 * http://mlton.org/HowProfilingWorks[MLton -- How Profiling Works]
208 * http://mlton.org/ForeignFunctionInterfaceTypes[MLton -- Foreign Function Interface Types]
209 * http://dwarfstd.org/[DWARF Debugging Standard]
210 * http://sourceware.org/gdb/current/onlinedocs/stabs/index.html[STABS Debugging Format]
213 Recommended Skills: SML programming experience; some compiler experience
216 Mentor: http://www.cs.rit.edu/%7Emtf[Matthew Fluet]
219 === Region Based Memory Management ===
221 Region based memory management is an alternative automatic memory
222 management scheme to garbage collection. Regions can be inferred by
223 the compiler (e.g., Cyclone and MLKit) or provided to the programmer
224 through a library. Since many students do not have extensive
225 experience with compilers we plan on adopting the later approach.
226 Creating a viable region based memory solution requires the removal of
227 the GC and changes to the allocator. Additionally, write barriers
228 will be necessary to ensure references between two ML objects is never
229 established if the left hand side of the assignment has a longer
230 lifetime than the right hand side. Students will need to come up with
231 an appropriate interface for creating, entering, and exiting regions
232 (examples include RTSJ scoped memory and SCJ scoped memory).
241 Recommended Skills: SML programming experience; C programming experience; some compiler and garbage collector experience
244 Mentor: http://www.cse.buffalo.edu/%7Elziarek/[Lukasz (Luke) Ziarek]
247 === Adding Real-Time Capabilities ===
249 This project focuses on exposing real-time APIs from a real-time OS
250 kernel at the SML level. This will require mapping the current MLton
251 (or http://multimlton.cs.purdue.edu[MultiMLton]) threading framework
252 to real-time threads that the RTOS provides. This will include
253 associating priorities with MLton threads and building priority based
254 scheduling algorithms. Additionally, support for perdioc, aperiodic,
255 and sporadic tasks should be supported. A real-time SML library will
256 need to be created to provide a forward facing interface for
257 programmers. Stretch goals include reworking the MLton +atomic+
258 statement and associated synchronization primitives built on top of
259 the MLton +atomic+ statement.
261 Recommended Skills: SML programming experience; C programming experience; real-time experience a plus but not required
264 Mentor: http://www.cse.buffalo.edu/%7Elziarek/[Lukasz (Luke) Ziarek]
267 === Real-Time Garbage Collection ===
269 This project focuses on modifications to the MLton GC to support
270 real-time garbage collection. We will model the real-time GC on the
271 Schism RTGC. The first task will be to create a fixed size runtime
272 object representation. Large structures will need to be represented
273 as a linked lists of fixed sized objects. Arrays and vectors will be
274 transferred into dense trees. Compaction and copying can therefore be
275 removed from the GC algorithms that MLton currently supports. Lastly,
276 the GC will be made concurrent, allowing for the execution of the GC
277 threads as the lowest priority task in the system. Stretch goals
278 include a priority aware mechanism for the GC to signal to real-time
279 ML threads that it needs to scan their stack and identification of
280 places where the stack is shallow to bound priority inversion during
283 Recommended Skills: C programming experience; garbage collector experience a plus but not required
286 Mentor: http://www.cse.buffalo.edu/%7Elziarek/[Lukasz (Luke) Ziarek]
290 === Concurrent{nbsp}ML Improvements ===
292 http://cml.cs.uchicago.edu/[Concurrent ML] is an SML concurrency
293 library based on synchronous message passing. MLton has a partial
294 implementation of the CML message-passing primitives, but its use in
295 real-world applications has been stymied by the lack of completeness
296 and thread-safe I/O libraries. This project would aim to flesh out
297 the CML implementation in MLton to be fully compatible with the
298 "official" version distributed as part of SML/NJ. Furthermore, time
299 permitting, runtime system support could be added to allow use of
300 modern OS features, such as asynchronous I/O, in the implementation of
301 CML's system interfaces.
305 * http://cml.cs.uchicago.edu/
306 * http://mlton.org/ConcurrentML
307 * http://mlton.org/ConcurrentMLImplementation
310 Recommended Skills: SML programming experience; knowledge of concurrent programming; some operating systems and/or systems programming experience
312 Mentor: http://people.cs.uchicago.edu/~jhr/[John Reppy]
313 Mentor: http://www.cs.rit.edu/%7Emtf[Matthew Fluet]
317 === SML3d Development ===
319 The SML3d Project is a collection of libraries to support 3D graphics
320 programming using Standard ML and the http://opengl.org/[OpenGL]
321 graphics API. It currently requires the MLton implementation of SML
322 and is supported on Linux, Mac OS X, and Microsoft Windows. There is
323 also support for http://www.khronos.org/opencl/[OpenCL]. This project
324 aims to continue development of the SML3d Project.
328 * http://sml3d.cs.uchicago.edu/
331 Mentor: http://people.cs.uchicago.edu/~jhr/[John Reppy]