Import Upstream version 20180207
[hcoop/debian/mlton.git] / doc / guide / src / GoogleSummerOfCode2013.adoc
1 Google Summer of Code (2013)
2 ============================
3 :toc:
4
5 == Mentors ==
6
7 The following developers have agreed to serve as mentors for the 2013 Google Summer of Code:
8
9 * http://www.cs.rit.edu/%7Emtf[Matthew Fluet]
10 * http://www.cse.buffalo.edu/%7Elziarek/[Lukasz (Luke) Ziarek]
11 * http://www.cs.purdue.edu/homes/suresh/[Suresh Jagannathan]
12
13 == Ideas List ==
14
15 === Implement a Partial Redundancy Elimination (PRE) Optimization ===
16
17 Partial redundancy elimination (PRE) is a program transformation that
18 removes operations that are redundant on some, but not necessarily all
19 paths, through the program. PRE can subsume both common subexpression
20 elimination and loop-invariant code motion, and is therefore a
21 potentially powerful optimization. However, a naïve
22 implementation of PRE on a program in static single assignment (SSA)
23 form is unlikely to be effective. This project aims to adapt and
24 implement the SSAPRE algorithm(s) of Thomas VanDrunen in MLton's SSA
25 intermediate language.
26
27 Background:
28 --
29 * http://onlinelibrary.wiley.com/doi/10.1002/spe.618/abstract[Anticipation-based partial redundancy elimination for static single assignment form]; Thomas VanDrunen and Antony L. Hosking
30 * http://cs.wheaton.edu/%7Etvandrun/writings/thesis.pdf[Partial Redundancy Elimination for Global Value Numbering]; Thomas VanDrunen
31 * http://www.springerlink.com/content/w06m3cw453nphm1u/[Value-Based Partial Redundancy Elimination]; Thomas VanDrunen and Antony L. Hosking
32 * http://portal.acm.org/citation.cfm?doid=319301.319348[Partial redundancy elimination in SSA form]; Robert Kennedy, Sun Chan, Shin-Ming Liu, Raymond Lo, Peng Tu, and Fred Chow
33 --
34
35 Recommended Skills: SML programming experience; some middle-end compiler experience
36
37 /////
38 Mentor: http://www.cs.rit.edu/%7Emtf[Matthew Fluet]
39 /////
40
41 === Design and Implement a Heap Profiler ===
42
43 A heap profile is a description of the space usage of a program. A
44 heap profile is concerned with the allocation, retention, and
45 deallocation (via garbage collection) of heap data during the
46 execution of a program. A heap profile can be used to diagnose
47 performance problems in a functional program that arise from space
48 leaks. This project aims to design and implement a heap profiler for
49 MLton compiled programs.
50
51 Background:
52 --
53 * http://portal.acm.org/citation.cfm?doid=583854.582451[GCspy: an adaptable heap visualisation framework]; Tony Printezis and Richard Jones
54 * http://journals.cambridge.org/action/displayAbstract?aid=1349892[New dimensions in heap profiling]; Colin Runciman and Niklas Röjemo
55 * http://www.springerlink.com/content/710501660722gw37/[Heap profiling for space efficiency]; Colin Runciman and Niklas Röjemo
56 * http://journals.cambridge.org/action/displayAbstract?aid=1323096[Heap profiling of lazy functional programs]; Colin Runciman and David Wakeling
57 --
58
59 Recommended Skills: C and SML programming experience; some experience with UI and visualization
60
61 /////
62 Mentor: http://www.cs.rit.edu/%7Emtf[Matthew Fluet]
63 /////
64
65 === Garbage Collector Improvements ===
66
67 The garbage collector plays a significant role in the performance of
68 functional languages. Garbage collect too often, and program
69 performance suffers due to the excessive time spent in the garbage
70 collector. Garbage collect not often enough, and program performance
71 suffers due to the excessive space used by the uncollected garbage.
72 One particular issue is ensuring that a program utilizing a garbage
73 collector "plays nice" with other processes on the system, by not
74 using too much or too little physical memory. While there are some
75 reasonable theoretical results about garbage collections with heaps of
76 fixed size, there seems to be insufficient work that really looks
77 carefully at the question of dynamically resizing the heap in response
78 to the live data demands of the application and, similarly, in
79 response to the behavior of the operating system and other processes.
80 This project aims to investigate improvements to the memory behavior of
81 MLton compiled programs through better tuning of the garbage
82 collector.
83
84 Background:
85 --
86 * http://www.dcs.gla.ac.uk/%7Ewhited/papers/automated_heap_sizing.pdf[Automated Heap Sizing in the Poly/ML Runtime (Position Paper)]; David White, Jeremy Singer, Jonathan Aitken, and David Matthews
87 * http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=4145125[Isla Vista Heap Sizing: Using Feedback to Avoid Paging]; Chris Grzegorczyk, Sunil Soman, Chandra Krintz, and Rich Wolski
88 * http://portal.acm.org/citation.cfm?doid=1152649.1152652[Controlling garbage collection and heap growth to reduce the execution time of Java applications]; Tim Brecht, Eshrat Arjomandi, Chang Li, and Hang Pham
89 * http://portal.acm.org/citation.cfm?doid=1065010.1065028[Garbage collection without paging]; Matthew Hertz, Yi Feng, and Emery D. Berger
90 * http://portal.acm.org/citation.cfm?doid=1029873.1029881[Automatic heap sizing: taking real memory into account]; Ting Yang, Matthew Hertz, Emery D. Berger, Scott F. Kaplan, and J. Eliot B. Moss
91 --
92
93 Recommended Skills: C programming experience; some operating systems and/or systems programming experience; some compiler and garbage collector experience
94
95 /////
96 Mentor: http://www.cs.rit.edu/%7Emtf[Matthew Fluet]
97 /////
98
99 === Implement Successor{nbsp}ML Language Features ===
100
101 Any programming language, including Standard{nbsp}ML, can be improved.
102 The community has identified a number of modest extensions and
103 revisions to the Standard{nbsp}ML programming language that would
104 likely prove useful in practice. This project aims to implement these
105 language features in the MLton compiler.
106
107 Background:
108 --
109 * http://successor-ml.org/index.php?title=Main_Page[Successor{nbsp}ML]
110 * http://www.mpi-sws.org/%7Erossberg/hamlet/index.html#successor-ml[HaMLet (Successor{nbsp}ML)]
111 * http://journals.cambridge.org/action/displayAbstract?aid=1322628[A critique of Standard{nbsp}ML]; Andrew W. Appel
112 --
113
114 Recommended Skills: SML programming experience; some front-end compiler experience (i.e., scanners and parsers)
115
116 /////
117 Mentor: http://www.cs.rit.edu/%7Emtf[Matthew Fluet]
118 /////
119
120 === Implement Source-level Debugging ===
121
122 Debugging is a fact of programming life. Unfortunately, most SML
123 implementations (including MLton) provide little to no source-level
124 debugging support. This project aims to add basic to intermediate
125 source-level debugging support to the MLton compiler. MLton already
126 supports source-level profiling, which can be used to attribute bytes
127 allocated or time spent in source functions. It should be relatively
128 straightforward to leverage this source-level information into basic
129 source-level debugging support, with the ability to set/unset
130 breakpoints and step through declarations and functions. It may be
131 possible to also provide intermediate source-level debugging support,
132 with the ability to inspect in-scope variables of basic types (e.g.,
133 types compatible with MLton's foreign function interface).
134
135 Background:
136 --
137 * http://mlton.org/HowProfilingWorks[MLton -- How Profiling Works]
138 * http://mlton.org/ForeignFunctionInterfaceTypes[MLton -- Foreign Function Interface Types]
139 * http://dwarfstd.org/[DWARF Debugging Standard]
140 * http://sourceware.org/gdb/current/onlinedocs/stabs/index.html[STABS Debugging Format]
141 --
142
143 Recommended Skills: SML programming experience; some compiler experience
144
145 /////
146 Mentor: http://www.cs.rit.edu/%7Emtf[Matthew Fluet]
147 /////
148
149 === SIMD Primitives ===
150
151 Most modern processors offer some direct support for SIMD (Single
152 Instruction, Multiple Data) operations, such as Intel's MMX/SSE
153 instructions, AMD's 3DNow! instructions, and IBM's AltiVec. Such
154 instructions are particularly useful for multimedia, scientific, and
155 cryptographic applications. This project aims to add preliminary
156 support for vector data and vector operations to the MLton compiler.
157 Ideally, after surveying SIMD instruction sets and SIMD support in
158 other compilers, a core set of SIMD primitives with broad architecture
159 and compiler support can be identified. After adding SIMD primitives
160 to the core compiler and carrying them through to the various
161 backends, there will be opportunities to design and implement an SML
162 library that exposes the primitives to the SML programmer as well as
163 opportunities to design and implement auto-vectorization
164 optimizations.
165
166 Background:
167 --
168 * http://en.wikipedia.org/wiki/SIMD[SIMD]
169 * http://gcc.gnu.org/projects/tree-ssa/vectorization.html[Auto-vectorization in GCC]
170 * http://llvm.org/docs/Vectorizers.html[Auto-vectorization in LLVM]
171 --
172
173 Recommended Skills: SML programming experience; some compiler experience; some computer architecture experience
174
175 /////
176 Mentor: http://www.cs.rit.edu/%7Emtf[Matthew Fluet]
177 /////
178
179 === RTOS Support ===
180
181 This project entails porting the MLton compiler to RTOSs such as:
182 RTEMS, RT Linux, and FreeRTOS. The project will include modifications
183 to the MLton build and configuration process. Students will need to
184 extend the MLton configuration process for each of the RTOSs. The
185 MLton compilation process will need to be extended to invoke the C
186 cross compilers the RTOSs provide for embedded support. Test scripts
187 for validation will be necessary and these will need to be run in
188 emulators for supported architectures.
189
190 Recommended Skills: C programming experience; some scripting experience
191
192 /////
193 Mentor: http://www.cse.buffalo.edu/%7Elziarek/[Lukasz (Luke) Ziarek]
194 /////
195
196 === Region Based Memory Management ===
197
198 Region based memory management is an alternative automatic memory
199 management scheme to garbage collection. Regions can be inferred by
200 the compiler (e.g., Cyclone and MLKit) or provided to the programmer
201 through a library. Since many students do not have extensive
202 experience with compilers we plan on adopting the later approach.
203 Creating a viable region based memory solution requires the removal of
204 the GC and changes to the allocator. Additionally, write barriers
205 will be necessary to ensure references between two ML objects is never
206 established if the left hand side of the assignment has a longer
207 lifetime than the right hand side. Students will need to come up with
208 an appropriate interface for creating, entering, and exiting regions
209 (examples include RTSJ scoped memory and SCJ scoped memory).
210
211 Background:
212 --
213 * Cyclone
214 * MLKit
215 * RTSJ + SCJ scopes
216 --
217
218 Recommended Skills: SML programming experience; C programming experience; some compiler and garbage collector experience
219
220 /////
221 Mentor: http://www.cse.buffalo.edu/%7Elziarek/[Lukasz (Luke) Ziarek]
222 /////
223
224 === Integration of Multi-MLton ===
225
226 http://multimlton.cs.purdue.edu[MultiMLton] is a compiler and runtime
227 environment that targets scalable multicore platforms. It is an
228 extension of MLton. It combines new language abstractions and
229 associated compiler analyses for expressing and implementing various
230 kinds of fine-grained parallelism (safe futures, speculation,
231 transactions, etc.), along with a sophisticated runtime system tuned
232 to efficiently handle large numbers of lightweight threads. The core
233 stable features of MultiMLton will need to be integrated with the
234 latest MLton public release. Certain experimental features, such as
235 support for the Intel SCC and distributed runtime will be omitted.
236 This project requires students to understand the delta between the
237 MultiMLton code base and the MLton code base. Students will need to
238 create build and configuration scripts for MLton to enable MultiMLton
239 features.
240
241 Background
242 --
243 * http://multimlton.cs.purdue.edu/mML/Publications.html[MultiMLton -- Publications]
244 --
245
246 Recommended Skills: SML programming experience; C programming experience; some compiler experience
247
248 /////
249 Mentor: http://www.cse.buffalo.edu/%7Elziarek/[Lukasz (Luke) Ziarek]
250 /////