Commit | Line | Data |
---|---|---|
7f918cf1 CE |
1 | Google Summer of Code (2015) |
2 | ============================ | |
3 | :toc: | |
4 | ||
5 | == Mentors == | |
6 | ||
7 | The following developers have agreed to serve as mentors for the 2015 Google Summer of Code: | |
8 | ||
9 | * http://www.cs.rit.edu/%7Emtf[Matthew Fluet] | |
10 | * http://www.cse.buffalo.edu/%7Elziarek/[Lukasz (Luke) Ziarek] | |
11 | ///// | |
12 | * http://people.cs.uchicago.edu/~jhr/[John Reppy] | |
13 | * http://www.cs.purdue.edu/homes/chandras[KC Sivaramakrishnan] | |
14 | * http://www.cs.purdue.edu/homes/suresh/[Suresh Jagannathan] | |
15 | ///// | |
16 | ||
17 | == Ideas List == | |
18 | ||
19 | ///// | |
20 | === Implement a Partial Redundancy Elimination (PRE) Optimization === | |
21 | ||
22 | Partial redundancy elimination (PRE) is a program transformation that | |
23 | removes operations that are redundant on some, but not necessarily all | |
24 | paths, through the program. PRE can subsume both common subexpression | |
25 | elimination and loop-invariant code motion, and is therefore a | |
26 | potentially powerful optimization. However, a naïve implementation of | |
27 | PRE on a program in static single assignment (SSA) form is unlikely to | |
28 | be effective. This project aims to adapt and implement the GVN-PRE | |
29 | algorithm of Thomas VanDrunen in MLton's SSA intermediate language. | |
30 | ||
31 | Background: | |
32 | -- | |
33 | * http://cs.wheaton.edu/%7Etvandrun/writings/thesis.pdf[Partial Redundancy Elimination for Global Value Numbering]; Thomas VanDrunen | |
34 | * http://www.cs.purdue.edu/research/technical_reports/2003/TR%2003-032.pdf[Corner-cases in Value-Based Partial Redundancy Elimination]; Thomas VanDrunen and Antony L. Hosking | |
35 | * http://www.springerlink.com/content/w06m3cw453nphm1u/[Value-Based Partial Redundancy Elimination]; Thomas VanDrunen and Antony L. Hosking | |
36 | * http://onlinelibrary.wiley.com/doi/10.1002/spe.618/abstract[Anticipation-based Partial Redundancy Elimination for Static Single Assignment Form]; Thomas VanDrunen and Antony L. Hosking | |
37 | * http://portal.acm.org/citation.cfm?doid=319301.319348[Partial Redundancy Elimination in SSA Form]; Robert Kennedy, Sun Chan, Shin-Ming Liu, Raymond Lo, Peng Tu, and Fred Chow | |
38 | -- | |
39 | ||
40 | Recommended Skills: SML programming experience; some middle-end compiler experience | |
41 | ||
42 | Mentor: http://www.cs.rit.edu/%7Emtf[Matthew Fluet] | |
43 | ///// | |
44 | ||
45 | === Design and Implement a Heap Profiler === | |
46 | ||
47 | A heap profile is a description of the space usage of a program. A | |
48 | heap profile is concerned with the allocation, retention, and | |
49 | deallocation (via garbage collection) of heap data during the | |
50 | execution of a program. A heap profile can be used to diagnose | |
51 | performance problems in a functional program that arise from space | |
52 | leaks. This project aims to design and implement a heap profiler for | |
53 | MLton compiled programs. | |
54 | ||
55 | Background: | |
56 | -- | |
57 | * http://portal.acm.org/citation.cfm?doid=583854.582451[GCspy: an adaptable heap visualisation framework]; Tony Printezis and Richard Jones | |
58 | * http://journals.cambridge.org/action/displayAbstract?aid=1349892[New dimensions in heap profiling]; Colin Runciman and Niklas Röjemo | |
59 | * http://www.springerlink.com/content/710501660722gw37/[Heap profiling for space efficiency]; Colin Runciman and Niklas Röjemo | |
60 | * http://journals.cambridge.org/action/displayAbstract?aid=1323096[Heap profiling of lazy functional programs]; Colin Runciman and David Wakeling | |
61 | -- | |
62 | ||
63 | Recommended Skills: C and SML programming experience; some experience with UI and visualization | |
64 | ||
65 | ///// | |
66 | Mentor: http://www.cs.rit.edu/%7Emtf[Matthew Fluet] | |
67 | ///// | |
68 | ||
69 | === Garbage Collector Improvements === | |
70 | ||
71 | The garbage collector plays a significant role in the performance of | |
72 | functional languages. Garbage collect too often, and program | |
73 | performance suffers due to the excessive time spent in the garbage | |
74 | collector. Garbage collect not often enough, and program performance | |
75 | suffers due to the excessive space used by the uncollected | |
76 | garbage. One particular issue is ensuring that a program utilizing a | |
77 | garbage collector "plays nice" with other processes on the system, by | |
78 | not using too much or too little physical memory. While there are some | |
79 | reasonable theoretical results about garbage collections with heaps of | |
80 | fixed size, there seems to be insufficient work that really looks | |
81 | carefully at the question of dynamically resizing the heap in response | |
82 | to the live data demands of the application and, similarly, in | |
83 | response to the behavior of the operating system and other | |
84 | processes. This project aims to investigate improvements to the memory | |
85 | behavior of MLton compiled programs through better tuning of the | |
86 | garbage collector. | |
87 | ||
88 | Background: | |
89 | -- | |
90 | * http://gchandbook.org/[The Garbage Collection Handbook: The Art of Automatic Memory Management]; Richard Jones, Antony Hosking, Eliot Moss | |
91 | * http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.24.1020[Dual-Mode Garbage Collection]; Patrick Sansom | |
92 | * http://portal.acm.org/citation.cfm?doid=1029873.1029881[Automatic Heap Sizing: Taking Real Memory into Account]; Ting Yang, Matthew Hertz, Emery D. Berger, Scott F. Kaplan, and J. Eliot B. Moss | |
93 | * http://portal.acm.org/citation.cfm?doid=1152649.1152652[Controlling Garbage Collection and Heap Growth to Reduce the Execution Time of Java Applications]; Tim Brecht, Eshrat Arjomandi, Chang Li, and Hang Pham | |
94 | * http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=4145125[Isla Vista Heap Sizing: Using Feedback to Avoid Paging]; Chris Grzegorczyk, Sunil Soman, Chandra Krintz, and Rich Wolski | |
95 | * http://portal.acm.org/citation.cfm?doid=1806651.1806669[The Economics of Garbage Collection]; Jeremy Singer, Richard E. Jones, Gavin Brown, and Mikel Luján | |
96 | * http://www.dcs.gla.ac.uk/%7Ejsinger/pdfs/tfp12.pdf[Automated Heap Sizing in the Poly/ML Runtime (Position Paper)]; David White, Jeremy Singer, Jonathan Aitken, and David Matthews | |
97 | * http://portal.acm.org/citation.cfm?doid=2555670.2466481[Control Theory for Principled Heap Sizing]; David R. White, Jeremy Singer, Jonathan M. Aitken, and Richard E. Jones | |
98 | -- | |
99 | ||
100 | Recommended Skills: C programming experience; some operating systems and/or systems programming experience; some compiler and garbage collector experience | |
101 | ||
102 | ///// | |
103 | Mentor: http://www.cs.rit.edu/%7Emtf[Matthew Fluet] | |
104 | ///// | |
105 | ||
106 | === Heap-allocated Activation Records === | |
107 | ||
108 | Activation records (a.k.a., stack frames) are traditionally allocated | |
109 | on a stack. This naturally corresponds to the call-return pattern of | |
110 | function invocation. However, there are some disadvantages to | |
111 | stack-allocated activation records. In a functional programming | |
112 | language, functions may be deeply recursive, resulting in call stacks | |
113 | that are much larger than typically supported by the operating system; | |
114 | hence, a functional programming language implementation will typically | |
115 | store its stack in its heap. Furthermore, a functional programming | |
116 | language implementation must handle and recover from stack overflow, | |
117 | by allocating a larger stack (again, in its heap) and copying | |
118 | activation records from the old stack to the new stack. In the | |
119 | presence of threads, stacks must be allocated in a heap and, in the | |
120 | presence of a garbage collector, should be garbage collected when | |
121 | unreachable. While heap-allocated activation records avoid many of | |
122 | these disadvantages, they have not been widely implemented. This | |
123 | project aims to implement and evaluate heap-allocated activation | |
124 | records in the MLton compiler. | |
125 | ||
126 | Background: | |
127 | -- | |
128 | * http://journals.cambridge.org/action/displayAbstract?aid=1295104[Empirical and Analytic Study of Stack Versus Heap Cost for Languages with Closures]; Andrew W. Appel and Zhong Shao | |
129 | * http://portal.acm.org/citation.cfm?doid=182590.156783[Space-efficient closure representations]; Zhong Shao and Andrew W. Appel | |
130 | * http://portal.acm.org/citation.cfm?doid=93548.93554[Representing control in the presence of first-class continuations]; R. Hieb, R. Kent Dybvig, and Carl Bruggeman | |
131 | -- | |
132 | ||
133 | Recommended Skills: SML programming experience; some middle- and back-end compiler experience | |
134 | ||
135 | ///// | |
136 | Mentor: http://www.cs.rit.edu/%7Emtf[Matthew Fluet] | |
137 | ///// | |
138 | ||
139 | === Correctly Rounded Floating-point Binary-to-Decimal and Decimal-to-Binary Conversion Routines in Standard ML === | |
140 | ||
141 | The | |
142 | http://en.wikipedia.org/wiki/IEEE_754-2008[IEEE Standard for Floating-Point Arithmetic (IEEE 754)] | |
143 | is the de facto representation for floating-point computation. | |
144 | However, it is a _binary_ (base 2) representation of floating-point | |
145 | values, while many applications call for input and output of | |
146 | floating-point values in _decimal_ (base 10) representation. The | |
147 | _decimal-to-binary_ conversion problem takes a decimal floating-point | |
148 | representation (e.g., a string like +"0.1"+) and returns the best | |
149 | binary floating-point representation of that number. The | |
150 | _binary-to-decimal_ conversion problem takes a binary floating-point | |
151 | representation and returns a decimal floating-point representation | |
152 | using the smallest number of digits that allow the decimal | |
153 | floating-point representation to be converted to the original binary | |
154 | floating-point representation. For both conversion routines, "best" | |
155 | is dependent upon the current floating-point rounding mode. | |
156 | ||
157 | MLton uses David Gay's | |
158 | http://www.netlib.org/fp/gdtoa.tgz[gdtoa library] for floating-point | |
159 | conversions. While this is an exellent library, it generalizes the | |
160 | decimal-to-binary and binary-to-decimal conversion routines beyond | |
161 | what is required by the | |
162 | http://standardml.org/Basis/[Standard ML Basis Library] and induces an | |
163 | external dependency on the compiler. Native implementations of these | |
164 | conversion routines in Standard ML would obviate the dependency on the | |
165 | +gdtoa+ library, while also being able to take advantage of Standard | |
166 | ML features in the implementation (e.g., the published algorithms | |
167 | often require use of infinite precision arithmetic, which is provided | |
168 | by the +IntInf+ structure in Standard ML, but is provided in an ad hoc | |
169 | fasion in the +gdtoa+ library). | |
170 | ||
171 | This project aims to develop a native implementation of the conversion | |
172 | routines in Standard ML. | |
173 | ||
174 | Background: | |
175 | -- | |
176 | * http://dl.acm.org/citation.cfm?doid=103162.103163[What every computer scientist should know about floating-point arithmetic]; David Goldberg | |
177 | * http://dl.acm.org/citation.cfm?doid=93542.93559[How to print floating-point numbers accurately]; Guy L. Steele, Jr. and Jon L. White | |
178 | * http://dl.acm.org/citation.cfm?doid=93542.93557[How to read floating point numbers accurately]; William D. Clinger | |
179 | * http://cm.bell-labs.com/cm/cs/doc/90/4-10.ps.gz[Correctly Rounded Binary-Decimal and Decimal-Binary Conversions]; David Gay | |
180 | * http://dl.acm.org/citation.cfm?doid=249069.231397[Printing floating-point numbers quickly and accurately]; Robert G. Burger and R. Kent Dybvig | |
181 | * http://dl.acm.org/citation.cfm?doid=1806596.1806623[Printing floating-point numbers quickly and accurately with integers]; Florian Loitsch | |
182 | -- | |
183 | ||
184 | Recommended Skills: SML programming experience; algorithm design and implementation | |
185 | ||
186 | ///// | |
187 | Mentor: http://www.cs.rit.edu/%7Emtf[Matthew Fluet] | |
188 | ///// | |
189 | ||
190 | === Implement Source-level Debugging === | |
191 | ||
192 | Debugging is a fact of programming life. Unfortunately, most SML | |
193 | implementations (including MLton) provide little to no source-level | |
194 | debugging support. This project aims to add basic to intermediate | |
195 | source-level debugging support to the MLton compiler. MLton already | |
196 | supports source-level profiling, which can be used to attribute bytes | |
197 | allocated or time spent in source functions. It should be relatively | |
198 | straightforward to leverage this source-level information into basic | |
199 | source-level debugging support, with the ability to set/unset | |
200 | breakpoints and step through declarations and functions. It may be | |
201 | possible to also provide intermediate source-level debugging support, | |
202 | with the ability to inspect in-scope variables of basic types (e.g., | |
203 | types compatible with MLton's foreign function interface). | |
204 | ||
205 | Background: | |
206 | -- | |
207 | * http://mlton.org/HowProfilingWorks[MLton -- How Profiling Works] | |
208 | * http://mlton.org/ForeignFunctionInterfaceTypes[MLton -- Foreign Function Interface Types] | |
209 | * http://dwarfstd.org/[DWARF Debugging Standard] | |
210 | * http://sourceware.org/gdb/current/onlinedocs/stabs/index.html[STABS Debugging Format] | |
211 | -- | |
212 | ||
213 | Recommended Skills: SML programming experience; some compiler experience | |
214 | ||
215 | ///// | |
216 | Mentor: http://www.cs.rit.edu/%7Emtf[Matthew Fluet] | |
217 | ///// | |
218 | ||
219 | === Region Based Memory Management === | |
220 | ||
221 | Region based memory management is an alternative automatic memory | |
222 | management scheme to garbage collection. Regions can be inferred by | |
223 | the compiler (e.g., Cyclone and MLKit) or provided to the programmer | |
224 | through a library. Since many students do not have extensive | |
225 | experience with compilers we plan on adopting the later approach. | |
226 | Creating a viable region based memory solution requires the removal of | |
227 | the GC and changes to the allocator. Additionally, write barriers | |
228 | will be necessary to ensure references between two ML objects is never | |
229 | established if the left hand side of the assignment has a longer | |
230 | lifetime than the right hand side. Students will need to come up with | |
231 | an appropriate interface for creating, entering, and exiting regions | |
232 | (examples include RTSJ scoped memory and SCJ scoped memory). | |
233 | ||
234 | Background: | |
235 | -- | |
236 | * Cyclone | |
237 | * MLKit | |
238 | * RTSJ + SCJ scopes | |
239 | -- | |
240 | ||
241 | Recommended Skills: SML programming experience; C programming experience; some compiler and garbage collector experience | |
242 | ||
243 | ///// | |
244 | Mentor: http://www.cse.buffalo.edu/%7Elziarek/[Lukasz (Luke) Ziarek] | |
245 | ///// | |
246 | ||
247 | === Adding Real-Time Capabilities === | |
248 | ||
249 | This project focuses on exposing real-time APIs from a real-time OS | |
250 | kernel at the SML level. This will require mapping the current MLton | |
251 | (or http://multimlton.cs.purdue.edu[MultiMLton]) threading framework | |
252 | to real-time threads that the RTOS provides. This will include | |
253 | associating priorities with MLton threads and building priority based | |
254 | scheduling algorithms. Additionally, support for perdioc, aperiodic, | |
255 | and sporadic tasks should be supported. A real-time SML library will | |
256 | need to be created to provide a forward facing interface for | |
257 | programmers. Stretch goals include reworking the MLton +atomic+ | |
258 | statement and associated synchronization primitives built on top of | |
259 | the MLton +atomic+ statement. | |
260 | ||
261 | Recommended Skills: SML programming experience; C programming experience; real-time experience a plus but not required | |
262 | ||
263 | ///// | |
264 | Mentor: http://www.cse.buffalo.edu/%7Elziarek/[Lukasz (Luke) Ziarek] | |
265 | ///// | |
266 | ||
267 | === Real-Time Garbage Collection === | |
268 | ||
269 | This project focuses on modifications to the MLton GC to support | |
270 | real-time garbage collection. We will model the real-time GC on the | |
271 | Schism RTGC. The first task will be to create a fixed size runtime | |
272 | object representation. Large structures will need to be represented | |
273 | as a linked lists of fixed sized objects. Arrays and vectors will be | |
274 | transferred into dense trees. Compaction and copying can therefore be | |
275 | removed from the GC algorithms that MLton currently supports. Lastly, | |
276 | the GC will be made concurrent, allowing for the execution of the GC | |
277 | threads as the lowest priority task in the system. Stretch goals | |
278 | include a priority aware mechanism for the GC to signal to real-time | |
279 | ML threads that it needs to scan their stack and identification of | |
280 | places where the stack is shallow to bound priority inversion during | |
281 | this procedure. | |
282 | ||
283 | Recommended Skills: C programming experience; garbage collector experience a plus but not required | |
284 | ||
285 | ///// | |
286 | Mentor: http://www.cse.buffalo.edu/%7Elziarek/[Lukasz (Luke) Ziarek] | |
287 | ///// | |
288 | ||
289 | ///// | |
290 | === Concurrent{nbsp}ML Improvements === | |
291 | ||
292 | http://cml.cs.uchicago.edu/[Concurrent ML] is an SML concurrency | |
293 | library based on synchronous message passing. MLton has a partial | |
294 | implementation of the CML message-passing primitives, but its use in | |
295 | real-world applications has been stymied by the lack of completeness | |
296 | and thread-safe I/O libraries. This project would aim to flesh out | |
297 | the CML implementation in MLton to be fully compatible with the | |
298 | "official" version distributed as part of SML/NJ. Furthermore, time | |
299 | permitting, runtime system support could be added to allow use of | |
300 | modern OS features, such as asynchronous I/O, in the implementation of | |
301 | CML's system interfaces. | |
302 | ||
303 | Background | |
304 | -- | |
305 | * http://cml.cs.uchicago.edu/ | |
306 | * http://mlton.org/ConcurrentML | |
307 | * http://mlton.org/ConcurrentMLImplementation | |
308 | -- | |
309 | ||
310 | Recommended Skills: SML programming experience; knowledge of concurrent programming; some operating systems and/or systems programming experience | |
311 | ||
312 | Mentor: http://people.cs.uchicago.edu/~jhr/[John Reppy] | |
313 | Mentor: http://www.cs.rit.edu/%7Emtf[Matthew Fluet] | |
314 | ///// | |
315 | ||
316 | ///// | |
317 | === SML3d Development === | |
318 | ||
319 | The SML3d Project is a collection of libraries to support 3D graphics | |
320 | programming using Standard ML and the http://opengl.org/[OpenGL] | |
321 | graphics API. It currently requires the MLton implementation of SML | |
322 | and is supported on Linux, Mac OS X, and Microsoft Windows. There is | |
323 | also support for http://www.khronos.org/opencl/[OpenCL]. This project | |
324 | aims to continue development of the SML3d Project. | |
325 | ||
326 | Background | |
327 | -- | |
328 | * http://sml3d.cs.uchicago.edu/ | |
329 | -- | |
330 | ||
331 | Mentor: http://people.cs.uchicago.edu/~jhr/[John Reppy] | |
332 | ///// |