ES6: Use ES6 modules. Update node.
[jackhill/mal.git] / process / guide.md
CommitLineData
0f4ca9d1
JM
1# The Make-A-Lisp Process
2
3So you want to write a Lisp interpreter? Welcome!
4
5The goal of the Make-A-Lisp project is to make it easy to write your
6own Lisp interpreter without sacrificing those many "Aha!" moments
7that come from ascending the McCarthy mountain. When you reach the peak
8of this particular mountain, you will have an interpreter for the mal
9Lisp language that is powerful enough to be self-hosting, meaning it
10will be able to run a mal interpreter written in mal itself.
11
12So jump right in (er ... start the climb)!
13
dc791440
DM
14- [Pick a language](#pick-a-language)
15- [Getting started](#getting-started)
16- [General hints](#general-hints)
17- [The Make-A-Lisp Process](#the-make-a-lisp-process-1)
18 - [Step 0: The REPL](#step-0-the-repl)
19 - [Step 1: Read and Print](#step-1-read-and-print)
20 - [Step 2: Eval](#step-2-eval)
21 - [Step 3: Environments](#step-3-environments)
22 - [Step 4: If Fn Do](#step-4-if-fn-do)
23 - [Step 5: Tail call optimization](#step-5-tail-call-optimization)
24 - [Step 6: Files, Mutation, and Evil](#step-6-files-mutation-and-evil)
25 - [Step 7: Quoting](#step-7-quoting)
26 - [Step 8: Macros](#step-8-macros)
27 - [Step 9: Try](#step-9-try)
28 - [Step A: Metadata, Self-hosting and Interop](#step-a-metadata-self-hosting-and-interop)
29
30
0f4ca9d1
JM
31## Pick a language
32
33You might already have a language in mind that you want to use.
34Technically speaking, mal can be implemented in any sufficiently
dd4020b9
JM
35complete programming language (i.e. Turing complete), however, there
36are a few language features that can make the task MUCH easier. Here
37are some of them in rough order of importance:
0f4ca9d1
JM
38
39* A sequential compound data structure (e.g. arrays, lists,
40 vectors, etc)
41* An associative compound data structure (e.g. a dictionary,
42 hash-map, associative array, etc)
43* Function references (first class functions, function pointers,
44 etc)
45* Real exception handling (try/catch, raise, throw, etc)
46* Variable argument functions (variadic, var args, splats, apply, etc)
47* Function closures
48* PCRE regular expressions
49
50In addition, the following will make your task especially easy:
51
bf518367
JM
52* Dynamic typing / boxed types (specifically, the ability to store
53 different data types in the sequential and associative structures
54 and the language keeps track of the type for you)
0f4ca9d1
JM
55* Compound data types support arbitrary runtime "hidden" data
56 (metadata, metatables, dynamic fields attributes)
57
58Here are some examples of languages that have all of the above
59features: JavaScript, Ruby, Python, Lua, R, Clojure.
60
cf70df1f
JM
61Michael Fogus has some great blog posts on interesting but less well
62known languages and many of the languages on his lists do not yet have
63any mal implementations:
64* http://blog.fogus.me/2011/08/14/perlis-languages/
65* http://blog.fogus.me/2011/10/18/programming-language-development-the-past-5-years/
66
0f4ca9d1
JM
67Many of the most popular languages already have Mal implementations.
68However, this should not discourage you from creating your own
69implementation in a language that already has one. However, if you go
70this route, I suggest you avoid referring to the existing
71implementations (i.e. "cheating") to maximize your learning experience
72instead of just borrowing mine. On the other hand, if your goal is to
73add new implementations to mal as efficiently as possible, then you
74SHOULD find the most similar target language implementation and refer
75to it frequently.
76
bf518367
JM
77If you want a fairly long list of programming languages with an
78approximate measure of popularity, try the [Programming Language
79Popularity Chart](http://langpop.corger.nl/)
80
0f4ca9d1
JM
81
82## Getting started
83
84* Install your chosen language interpreter/compiler, language package
85 manager and build tools (if applicable)
86
87* Fork the mal repository on github and then clone your forked
88 repository:
89```
90git clone git@github.com:YOUR_NAME/mal.git
91cd mal
92```
93
fc03712f
EC
94* Make a new directory for your implementation. For example, if your
95language is called "quux":
0f4ca9d1
JM
96```
97mkdir quux
98```
99
100* Modify the top level Makefile to allow the tests to be run against
101 your implementation. For example, if your language is named "quux"
102 and uses "qx" as the file extension, then make the following
103 3 modifications to Makefile:
104```
105IMPLS = ... quux ...
106...
107quux_STEP_TO_PROG = mylang/$($(1)).qx
20e8dea0
JM
108```
109
110* Add a "run" script to you implementation directory that listens to
111 the "STEP" environment variable for the implementation step to run
e1b080b4
JL
112 and defaults to "stepA_mal". Make sure the run script has the
113 executable file permission set (or else the test runner might fail with a
114 permission denied error message). The following are examples of "run"
20e8dea0
JM
115 scripts for a compiled language and an interpreted language (where
116 the interpreter is named "quux"):
117
118```
119#!/bin/bash
120exec $(dirname $0)/${STEP:-stepA_mal} "${@}"
121```
122
123```
124#!/bin/bash
125exec quux $(dirname $0)/${STEP:-stepA_mal}.qx "${@}"
0f4ca9d1
JM
126```
127
128This allows you to run tests against your implementation like this:
129```
e5737b08 130make "test^quux^stepX"
0f4ca9d1
JM
131```
132
a15f76bd 133If your implementation language is a compiled language, then you
396d869e 134should also add a Makefile at the top level of your implementation
a15f76bd
JM
135directory. This Makefile will define how to build the files pointed to
136by the quux_STEP_TO_PROG macro. The top-level Makefile will attempt to
137build those targets before running tests. If it is a scripting
396d869e
JM
138language/uncompiled, then no Makefile is necessary because
139quux_STEP_TO_PROG will point to a source file that already exists and
140does not need to be compiled/built.
141
0f4ca9d1
JM
142
143## General hints
144
145Stackoverflow and Google are your best friends. Modern polyglot
146developers do not memorize dozens of programming languages. Instead,
147they learn the peculiar terminology used with each language and then
148use this to search for their answers.
149
150Here are some other resources where multiple languages are
151compared/described:
152* http://learnxinyminutes.com/
153* http://hyperpolyglot.org/
154* http://rosettacode.org/
334a71b6 155* http://rigaux.org/language-study/syntax-across-languages/
0f4ca9d1
JM
156
157Do not let yourself be bogged down by specific problems. While the
158make-a-lisp process is structured as a series of steps, the reality is
159that building a lisp interpreter is more like a branching tree. If you
160get stuck on tail call optimization, or hash-maps, move on to other
161things. You will often have a stroke of inspiration for a problem as
162you work through other functionality. I have tried to structure this
45a8b3ca
JM
163guide and the tests to make clear which things can be deferred until
164later.
165
166An aside on deferrable/optional bits: when you run the tests for
167a given step, the last tests are often marked with an "optional"
168header. This indicates that these are tests for functionality that is
169not critical to finish a basic mal implementation. Many of the steps
170in this process guide have a "Deferrable" section, however, it is not
171quite the same meaning. Those sections include the functionality that
172is marked as optional in the tests, but they also include
0f4ca9d1
JM
173functionality that becomes mandatory at a later step. In other words,
174this is a "make your own Lisp adventure".
175
176Use test driven development. Each step of the make-a-lisp process has
177a bunch of tests associated with it and there is an easy script to run
178all the tests for a specific step in the process. Pick a failing test,
179fix it, repeat until all the tests for that step pass.
180
5c052692
WH
181## Reference Code
182
0f4ca9d1 183The `process` directory contains abbreviated pseudocode and
a15f76bd 184architecture diagrams for each step of the make-a-lisp process. Use
0f4ca9d1 185a textual diff/comparison tool to compare the previous pseudocode step
a15f76bd
JM
186with the one you are working on. The architecture diagram images have
187changes from the previous step highlighted in red. There is also
c1ef46cb
JM
188a concise
189[cheatsheet](http://kanaka.github.io/mal/process/cheatsheet.html) that
190summarizes the key changes at each step.
0f4ca9d1 191
daa1cf3f
JM
192If you get completely stuck and are feeling like giving up, then you
193should "cheat" by referring to the same step or functionality in
194a existing implementation language. You are here to learn, not to take
195a test, so do not feel bad about it. Okay, you should feel a little
196bit bad about it.
0f4ca9d1
JM
197
198
199## The Make-A-Lisp Process
200
201In the steps that follow the name of the target language is "quux" and
202the file extension for that language is "qx".
203
204
205<a name="step0"></a>
206
207### Step 0: The REPL
208
209![step0_repl architecture](step0_repl.png)
210
211This step is basically just creating a skeleton of your interpreter.
212
213* Create a `step0_repl.qx` file in `quux/`.
214
215* Add the 4 trivial functions `READ`, `EVAL`, `PRINT`, and `rep`
216 (read-eval-print). `READ`, `EVAL`, and `PRINT` are basically just
217 stubs that return their first parameter (a string if your target
218 language is a statically typed) and `rep` calls them in order
219 passing the return to the input of the next.
220
58ba5af4
JM
221* Add a main loop that repeatedly prints a prompt (needs to be
222 "user> " for later tests to pass), gets a line of input from the
223 user, calls `rep` with that line of input, and then prints out the
224 result from `rep`. It should also exit when you send it an EOF
225 (often Ctrl-D).
0f4ca9d1
JM
226
227* If you are using a compiled (ahead-of-time rather than just-in-time)
228 language, then create a Makefile (or appropriate project definition
229 file) in your directory.
230
daa1cf3f
JM
231It is time to run your first tests. This will check that your program
232does input and output in a way that can be captured by the test
233harness. Go to the top level and run the following:
234```
e5737b08 235make "test^quux^step0"
daa1cf3f 236```
0f4ca9d1
JM
237
238Add and then commit your new `step0_repl.qx` and `Makefile` to git.
239
240Congratulations! You have just completed the first step of the
241make-a-lisp process.
242
243
bd62ff74 244#### Optional:
0f4ca9d1
JM
245
246* Add full line editing and command history support to your
247 interpreter REPL. Many languages have a library/module that provide
248 line editing support. Another option if your language supports it is
249 to use an FFI (foreign function interface) to load and call directly
94a954f5 250 into GNU readline, editline, or linenoise library. Add line
0f4ca9d1
JM
251 editing interface code to `readline.qx`
252
253
254<a name="step1"></a>
255
256### Step 1: Read and Print
257
258![step1_read_print architecture](step1_read_print.png)
259
260In this step, your interpreter will "read" the string from the user
261and parse it into an internal tree data structure (an abstract syntax
262tree) and then take that data structure and "print" it back to
263a string.
264
265In non-lisp languages, this step (called "lexing and parsing") can be
266one of the most complicated parts of the compiler/interpreter. In
267Lisp, the data structure that you want in memory is basically
268represented directly in the code that the programmer writes
269(homoiconicity).
270
271For example, if the string is "(+ 2 (* 3 4))" then the read function
272will process this into a tree structure that looks like this:
273```
274 List
275 / | \
276 / | \
277 / | \
278 Sym:+ Int:2 List
279 / | \
280 / | \
281 / | \
282 Sym:* Int:3 Int:4
283```
284
285Each left paren and its matching right paren (lisp "sexpr") becomes
286a node in the tree and everything else becomes a leaf in the tree.
287
288If you can find code for an implementation of a JSON encoder/decoder
289in your target language then you can probably just borrow and modify
290that and be 75% of the way done with this step.
291
292The rest of this section is going to assume that you are not starting
293from an existing JSON encoder/decoder, but that you do have access to
294a Perl compatible regular expressions (PCRE) module/library. You can
295certainly implement the reader using simple string operations, but it
296is more involved. The `make`, `ps` (postscript) and Haskell
297implementations have examples of a reader/parser without using regular
298expression support.
299
300* Copy `step0_repl.qx` to `step1_read_print.qx`.
301
302* Add a `reader.qx` file to hold functions related to the reader.
303
304* If the target language has objects types (OOP), then the next step
305 is to create a simple stateful Reader object in `reader.qx`. This
306 object will store the tokens and a position. The Reader object will
d717d991 307 have two methods: `next` and `peek`. `next` returns the token at
0f4ca9d1
JM
308 the current position and increments the position. `peek` just
309 returns the token at the current position.
310
311* Add a function `read_str` in `reader.qx`. This function
312 will call `tokenizer` and then create a new Reader object instance
313 with the tokens. Then it will call `read_form` with the Reader
314 instance.
315
316* Add a function `tokenizer` in `reader.qx`. This function will take
2315fd53 317 a single string and return an array/list
0f4ca9d1
JM
318 of all the tokens (strings) in it. The following regular expression
319 (PCRE) will match all mal tokens.
320```
321[\s,]*(~@|[\[\]{}()'`~^@]|"(?:\\.|[^\\"])*"|;.*|[^\s\[\]{}('"`,;)]*)
322```
cfdf00cc
OR
323* For each match captured within the parenthesis starting at char 6 of the
324 regular expression a new token will be created.
325
326 * `[\s,]*`: Matches any number of whitespaces or commas. This is not captured
327 so it will be ignored and not tokenized.
328
329 * `~@`: Captures the special two-characters `~@` (tokenized).
330
331 * ```[\[\]{}()'`~^@]```: Captures any special single character, one of
185052f1 332 ```[]{}()'`~^@``` (tokenized).
cfdf00cc
OR
333
334 * `"(?:\\.|[^\\"])*"`: Starts capturing at a double-quote and stops at the
335 next double-quote unless it was proceeded by a backslash in which case it
a85b8454 336 includes it until the next double-quote (tokenized).
cfdf00cc
OR
337
338 * `;.*`: Captures any sequence of characters starting with `;` (tokenized).
339
340 * ```[^\s\[\]{}('"`,;)]*```: Captures a sequence of zero or more non special
341 characters (e.g. symbols, numbers, "true", "false", and "nil") and is sort
342 of the inverse of the one above that captures special characters (tokenized).
0f4ca9d1
JM
343
344* Add the function `read_form` to `reader.qx`. This function
345 will peek at the first token in the Reader object and switch on the
346 first character of that token. If the character is a left paren then
347 `read_list` is called with the Reader object. Otherwise, `read_atom`
348 is called with the Reader Object. The return value from `read_form`
349 is a mal data type. If your target language is statically typed then
350 you will need some way for `read_form` to return a variant or
351 subclass type. For example, if your language is object oriented,
2315fd53 352 then you can define a top level MalType (in `types.qx`) that all
0f4ca9d1
JM
353 your mal data types inherit from. The MalList type (which also
354 inherits from MalType) will contains a list/array of other MalTypes.
355 If your language is dynamically typed then you can likely just
356 return a plain list/array of other mal types.
357
358* Add the function `read_list` to `reader.qx`. This function will
359 repeatedly call `read_form` with the Reader object until it
360 encounters a ')' token (if it reach EOF before reading a ')' then
361 that is an error). It accumulates the results into a List type. If
362 your language does not have a sequential data type that can hold mal
363 type values you may need to implement one (in `types.qx`). Note
364 that `read_list` repeatedly calls `read_form` rather than
a85b8454 365 `read_atom`. This mutually recursive definition between `read_list`
0f4ca9d1
JM
366 and `read_form` is what allows lists to contain lists.
367
368* Add the function `read_atom` to `reader.qx`. This function will
369 look at the contents of the token and return the appropriate scalar
370 (simple/single) data type value. Initially, you can just implement
b84a50cc 371 numbers (integers) and symbols. This will allow you to proceed
0f4ca9d1
JM
372 through the next couple of steps before you will need to implement
373 the other fundamental mal types: nil, true, false, and string. The
374 remaining mal types: keyword, vector, hash-map, and atom do not
375 need to be implemented until step 9 (but can be implemented at any
376 point between this step and that). BTW, symbols types are just an
377 object that contains a single string name value (some languages have
378 symbol types already).
379
380* Add a file `printer.qx`. This file will contain a single function
381 `pr_str` which does the opposite of `read_str`: take a mal data
382 structure and return a string representation of it. But `pr_str` is
383 much simpler and is basically just a switch statement on the type of
384 the input object:
385
386 * symbol: return the string name of the symbol
387 * number: return the number as a string
388 * list: iterate through each element of the list calling `pr_str` on
389 it, then join the results with a space separator, and surround the
390 final result with parens
391
392* Change the `READ` function in `step1_read_print.qx` to call
393 `reader.read_str` and the `PRINT` function to call `printer.pr_str`.
394 `EVAL` continues to simply return its input but the type is now
395 a mal data type.
396
397You now have enough hooked up to begin testing your code. You can
398manually try some simple inputs:
399 * `123` -> `123`
400 * ` 123 ` -> `123`
401 * `abc` -> `abc`
402 * ` abc ` -> `abc`
403 * `(123 456)` -> `(123 456)`
404 * `( 123 456 789 ) ` -> `(123 456 789)`
6a767b0d 405 * `( + 2 (* 3 4) ) ` -> `(+ 2 (* 3 4))`
0f4ca9d1
JM
406
407To verify that your code is doing more than just eliminating extra
408spaces (and not failing), you can instrument your `reader.qx` functions.
409
410Once you have gotten past those simple manual tests, it is time to run
411the full suite of step 1 tests. Go to the top level and run the
412following:
413```
e5737b08 414make "test^quux^step1"
0f4ca9d1
JM
415```
416
417Fix any test failures related to symbols, numbers and lists.
418
419Depending on the functionality of your target language, it is likely
420that you have now just completed one of the most difficult steps. It
421is down hill from here. The remaining steps will probably be easier
422and each step will give progressively more bang for the buck.
423
45a8b3ca 424#### Deferrable:
0f4ca9d1
JM
425
426
0f4ca9d1
JM
427* Add support for the other basic data type to your reader and printer
428 functions: string, nil, true, and false. These become mandatory at
8d78bc26
JM
429 step 4. When a string is read, the following transformations are
430 applied: a backslash followed by a doublequote is translated into
431 a plain doublequote character, a backslash followed by "n" is
432 translated into a newline, and a backslash followed by another
433 backslash is translated into a single backslash. To properly print
434 a string (for step 4 string functions), the `pr_str` function needs
435 another parameter called `print_readably`. When `print_readably` is
436 true, doublequotes, newlines, and backslashes are translated into
437 their printed representations (the reverse of the reader). The
438 `PRINT` function in the main program should call `pr_str` with
439 print_readably set to true.
0f4ca9d1 440
4b7ee8fa
JM
441* Add error checking to your reader functions to make sure parens
442 are properly matched. Catch and print these errors in your main
443 loop. If your language does not have try/catch style bubble up
444 exception handling, then you will need to add explicit error
445 handling to your code to catch and pass on errors without crashing.
446
447* Add support for reader macros which are forms that are
448 transformed into other forms during the read phase. Refer to
449 `tests/step1_read_print.mal` for the form that these macros should
450 take (they are just simple transformations of the token stream).
451
1c2eab90 452* Add support for the other mal types: keyword, vector, hash-map.
2345a3da
JM
453 * keyword: a keyword is a token that begins with a colon. A keyword
454 can just be stored as a string with special unicode prefix like
455 0x29E (or char 0xff/127 if the target language does not have good
456 unicode support) and the printer translates strings with that
457 prefix back to the keyword representation. This makes it easy to
458 use keywords as hash map keys in most languages. You can also
459 store keywords as a unique data type, but you will need to make
460 sure they can be used as hash map keys (which may involve doing
461 a similar prefixed translation anyways).
462 * vector: a vector can be implemented with same underlying
463 type as a list as long as there is some mechanism to keep track of
464 the difference. You can use the same reader function for both
465 lists and vectors by adding parameters for the starting and ending
466 tokens.
467 * hash-map: a hash-map is an associative data structure that maps
468 strings to other mal values. If you implement keywords as prefixed
469 strings, then you only need a native associative data structure
470 which supports string keys. Clojure allows any value to be a hash
471 map key, but the base functionality in mal is to support strings
472 and keyword keys. Because of the representation of hash-maps as
473 an alternating sequence of keys and values, you can probably use
474 the same reader function for hash-maps as lists and vectors with
475 parameters to indicate the starting and ending tokens. The odd
476 tokens are then used for keys with the corresponding even tokens
477 as the values.
478
0f4ca9d1
JM
479* Add comment support to your reader. The tokenizer should ignore
480 tokens that start with ";". Your `read_str` function will need to
481 properly handle when the tokenizer returns no values. The simplest
482 way to do this is to return `nil` mal value. A cleaner option (that
483 does not print `nil` at the prompt is to throw a special exception
484 that causes the main loop to simply continue at the beginning of the
485 loop without calling `rep`.
486
487
488<a name="step2"></a>
489
490### Step 2: Eval
491
492![step2_eval architecture](step2_eval.png)
493
494In step 1 your mal interpreter was basically just a way to validate
495input and eliminate extraneous white space. In this step you will turn
496your interpreter into a simple number calculator by adding
497functionality to the evaluator (`EVAL`).
498
499Compare the pseudocode for step 1 and step 2 to get a basic idea of
500the changes that will be made during this step:
501```
502diff -urp ../process/step1_read_print.txt ../process/step2_eval.txt
503```
504
505* Copy `step1_read_print.qx` to `step2_eval.qx`.
506
507* Define a simple initial REPL environment. This environment is an
508 associative structure that maps symbols (or symbol names) to
509 numeric functions. For example, in python this would look something
510 like this:
511```
512repl_env = {'+': lambda a,b: a+b,
513 '-': lambda a,b: a-b,
514 '*': lambda a,b: a*b,
515 '/': lambda a,b: int(a/b)}
516```
517
518* Modify the `rep` function to pass the REPL environment as the second
519 parameter for the `EVAL` call.
520
521* Create a new function `eval_ast` which takes `ast` (mal data type)
522 and an associative structure (the environment from above).
523 `eval_ast` switches on the type of `ast` as follows:
524
525 * symbol: lookup the symbol in the environment structure and return
0198b7a2 526 the value or raise an error if no value is found
0f4ca9d1
JM
527 * list: return a new list that is the result of calling `EVAL` on
528 each of the members of the list
529 * otherwise just return the original `ast` value
530
531* Modify `EVAL` to check if the first parameter `ast` is a list.
532 * `ast` is not a list: then return the result of calling `eval_ast`
533 on it.
efa2daef 534 * `ast` is a empty list: return ast unchanged.
0f4ca9d1
JM
535 * `ast` is a list: call `eval_ast` to get a new evaluated list. Take
536 the first item of the evaluated list and call it as function using
537 the rest of the evaluated list as its arguments.
538
539If your target language does not have full variable length argument
540support (e.g. variadic, vararg, splats, apply) then you will need to
541pass the full list of arguments as a single parameter and split apart
542the individual values inside of every mal function. This is annoying,
543but workable.
544
545The process of taking a list and invoking or executing it to return
546something new is known in Lisp as the "apply" phase.
547
548Try some simple expressions:
549
550 * `(+ 2 3)` -> `5`
551 * `(+ 2 (* 3 4))` -> `14`
552
553The most likely challenge you will encounter is how to properly call
554a function references using an arguments list.
555
556Now go to the top level, run the step 2 tests and fix the errors.
557```
e5737b08 558make "test^quux^step2"
0f4ca9d1
JM
559```
560
561You now have a simple prefix notation calculator!
562
0748211b
DM
563#### Deferrable:
564
565* `eval_ast` should evaluate elements of vectors and hash-maps. Add the
566 following cases in `eval_ast`:
567 * If `ast` is a vector: return a new vector that is the result of calling
568 `EVAL` on each of the members of the vector.
569 * If `ast` is a hash-map: return a new hash-map which consists of key-value
570 pairs where the key is a key from the hash-map and the value is the result
571 of calling `EVAL` on the corresponding value.
572
0f4ca9d1
JM
573
574<a name="step3"></a>
575
576### Step 3: Environments
577
578![step3_env architecture](step3_env.png)
579
580In step 2 you were already introduced to REPL environment (`repl_env`)
581where the basic numeric functions were stored and looked up. In this
582step you will add the ability to create new environments (`let*`) and
daa1cf3f 583modify existing environments (`def!`).
0f4ca9d1
JM
584
585A Lisp environment is an associative data structure that maps symbols (the
586keys) to values. But Lisp environments have an additional important
587function: they can refer to another environment (the outer
588environment). During environment lookups, if the current environment
589does not have the symbol, the lookup continues in the outer
590environment, and continues this way until the symbol is either found,
591or the outer environment is `nil` (the outermost environment in the
592chain).
593
594Compare the pseudocode for step 2 and step 3 to get a basic idea of
595the changes that will be made during this step:
596```
597diff -urp ../process/step2_eval.txt ../process/step3_env.txt
598```
599
d9c020b0 600* Copy `step2_eval.qx` to `step3_env.qx`.
0f4ca9d1
JM
601
602* Create `env.qx` to hold the environment definition.
603
604* Define an `Env` object that is instantiated with a single `outer`
605 parameter and starts with an empty associative data structure
606 property `data`.
607
608* Define three methods for the Env object:
609 * set: takes a symbol key and a mal value and adds to the `data`
610 structure
611 * find: takes a symbol key and if the current environment contains
612 that key then return the environment. If no key is found and outer
613 is not `nil` then call find (recurse) on the outer environment.
614 * get: takes a symbol key and uses the `find` method to locate the
615 environment with the key, then returns the matching value. If no
616 key is found up the outer chain, then throws/raises a "not found"
617 error.
618
169ddeb2 619* Update `step3_env.qx` to use the new `Env` type to create the
0f4ca9d1
JM
620 repl_env (with a `nil` outer value) and use the `set` method to add
621 the numeric functions.
622
623* Modify `eval_ast` to call the `get` method on the `env` parameter.
624
625* Modify the apply section of `EVAL` to switch on the first element of
626 the list:
627 * symbol "def!": call the set method of the current environment
628 (second parameter of `EVAL` called `env`) using the unevaluated
629 first parameter (second list element) as the symbol key and the
630 evaluated second parameter as the value.
a727d6e8 631 * symbol "let\*": create a new environment using the current
0f4ca9d1 632 environment as the outer value and then use the first parameter as
a727d6e8
MK
633 a list of new bindings in the "let\*" environment. Take the second
634 element of the binding list, call `EVAL` using the new "let\*"
0f4ca9d1 635 environment as the evaluation environment, then call `set` on the
a727d6e8 636 "let\*" environment using the first binding list element as the key
0f4ca9d1
JM
637 and the evaluated second element as the value. This is repeated
638 for each odd/even pair in the binding list. Note in particular,
639 the bindings earlier in the list can be referred to by later
640 bindings. Finally, the second parameter (third element) of the
a727d6e8 641 original `let*` form is evaluated using the new "let\*" environment
0f4ca9d1
JM
642 and the result is returned as the result of the `let*` (the new
643 let environment is discarded upon completion).
644 * otherwise: call `eval_ast` on the list and apply the first element
645 to the rest as before.
646
647`def!` and `let*` are Lisp "specials" (or "special atoms") which means
648that they are language level features and more specifically that the
649rest of the list elements (arguments) may be evaluated differently (or
650not at all) unlike the default apply case where all elements of the
651list are evaluated before the first element is invoked. Lists which
652contain a "special" as the first element are known as "special forms".
34935988 653They are special because they follow special evaluation rules.
0f4ca9d1
JM
654
655Try some simple environment tests:
656
657 * `(def! a 6)` -> `6`
658 * `a` -> `6`
659 * `(def! b (+ a 2))` -> `8`
660 * `(+ a b)` -> `14`
661 * `(let* (c 2) c)` -> `2`
662
663Now go to the top level, run the step 3 tests and fix the errors.
664```
e5737b08 665make "test^quux^step3"
0f4ca9d1
JM
666```
667
668You mal implementation is still basically just a numeric calculator
669with save/restore capability. But you have set the foundation for step
6704 where it will begin to feel like a real programming language.
671
672
673An aside on mutation and typing:
674
675The "!" suffix on symbols is used to indicate that this symbol refers
676to a function that mutates something else. In this case, the `def!`
677symbol indicates a special form that will mutate the current
678environment. Many (maybe even most) of runtime problems that are
679encountered in software engineering are a result of mutation. By
680clearly marking code where mutation may occur, you can more easily
681track down the likely cause of runtime problems when they do occur.
682
683Another cause of runtime errors is type errors, where a value of one
684type is unexpectedly treated by the program as a different and
685incompatible type. Statically typed languages try to make the
686programmer solve all type problems before the program is allowed to
687run. Most Lisp variants tend to be dynamically typed (types of values
688are checked when they are actually used at runtime).
689
690As an aside-aside: The great debate between static and dynamic typing
4e3fd73b
MK
691can be understood by following the money. Advocates of strict static
692typing use words like "correctness" and "safety" and thus get
0f4ca9d1
JM
693government and academic funding. Advocates of dynamic typing use words
694like "agile" and "time-to-market" and thus get venture capital and
695commercial funding.
696
697
698<a name="step4"></a>
699
700### Step 4: If Fn Do
701
702![step4_if_fn_do architecture](step4_if_fn_do.png)
703
704In step 3 you added environments and the special forms for
705manipulating environments. In this step you will add 3 new special
706forms (`if`, `fn*` and `do`) and add several more core functions to
707the default REPL environment. Our new architecture will look like
708this:
709
710The `fn*` special form is how new user-defined functions are created.
711In some Lisps, this special form is named "lambda".
712
713Compare the pseudocode for step 3 and step 4 to get a basic idea of
714the changes that will be made during this step:
715```
716diff -urp ../process/step3_env.txt ../process/step4_if_fn_do.txt
717```
718
719* Copy `step3_env.qx` to `step4_if_fn_do.qx`.
720
721* If you have not implemented reader and printer support (and data
722 types) for `nil`, `true` and `false`, you will need to do so for
723 this step.
724
725* Update the constructor/initializer for environments to take two new
726 arguments: `binds` and `exprs`. Bind (`set`) each element (symbol)
727 of the binds list to the respective element of the `exprs` list.
728
729* Add support to `printer.qx` to print functions values. A string
730 literal like "#<function>" is sufficient.
731
1af1f7f1 732* Add the following special forms to `EVAL`:
0f4ca9d1 733
1af1f7f1 734 * `do`: Evaluate all the elements of the list using `eval_ast`
f558a0c8 735 and return the final evaluated element.
0f4ca9d1
JM
736 * `if`: Evaluate the first parameter (second element). If the result
737 (condition) is anything other than `nil` or `false`, then evaluate
1af1f7f1 738 the second parameter (third element of the list) and return the
0f4ca9d1
JM
739 result. Otherwise, evaluate the third parameter (fourth element)
740 and return the result. If condition is false and there is no third
741 parameter, then just return `nil`.
742 * `fn*`: Return a new function closure. The body of that closure
743 does the following:
744 * Create a new environment using `env` (closed over from outer
745 scope) as the `outer` parameter, the first parameter (second
746 list element of `ast` from the outer scope) as the `binds`
747 parameter, and the parameters to the closure as the `exprs`
748 parameter.
749 * Call `EVAL` on the second parameter (third list element of `ast`
750 from outer scope), using the new environment. Use the result as
751 the return value of the closure.
752
753If your target language does not support closures, then you will need
754to implement `fn*` using some sort of structure or object that stores
755the values being closed over: the first and second elements of the
756`ast` list (function parameter list and function body) and the current
757environment `env`. In this case, your native functions will need to be
758wrapped in the same way. You will probably also need a method/function
759that invokes your function object/structure for the default case of
760the apply section of `EVAL`.
761
762Try out the basic functionality you have implemented:
763
764 * `(fn* [a] a)` -> `#<function>`
765 * `( (fn* [a] a) 7)` -> `7`
766 * `( (fn* [a] (+ a 1)) 10)` -> `11`
767 * `( (fn* [a b] (+ a b)) 2 3)` -> `5`
768
769* Add a new file `core.qx` and define an associative data structure
770 `ns` (namespace) that maps symbols to functions. Move the numeric
771 function definitions into this structure.
772
773* Modify `step4_if_fn_do.qx` to iterate through the `core.ns`
774 structure and add (`set`) each symbol/function mapping to the
775 REPL environment (`repl_env`).
776
777* Add the following functions to `core.ns`:
4b7ee8fa
JM
778 * `prn`: call `pr_str` on the first parameter with `print_readably`
779 set to true, prints the result to the screen and then return
a1eb30fc 780 `nil`. Note that the full version of `prn` is a deferrable below.
0f4ca9d1
JM
781 * `list`: take the parameters and return them as a list.
782 * `list?`: return true if the first parameter is a list, false
783 otherwise.
784 * `empty?`: treat the first parameter as a list and return true if
785 the list is empty and false if it contains any elements.
786 * `count`: treat the first parameter as a list and return the number
787 of elements that it contains.
788 * `=`: compare the first two parameters and return true if they are
789 the same type and contain the same value. In the case of equal
790 length lists, each element of the list should be compared for
791 equality and if they are the same return true, otherwise false.
792 * `<`, `<=`, `>`, and `>=`: treat the first two parameters as
793 numbers and do the corresponding numeric comparison, returning
794 either true or false.
795
796Now go to the top level, run the step 4 tests. There are a lot of
797tests in step 4 but all of the non-optional tests that do not involve
798strings should be able to pass now.
799
800```
e5737b08 801make "test^quux^step4"
0f4ca9d1
JM
802```
803
804Your mal implementation is already beginning to look like a real
805language. You have flow control, conditionals, user-defined functions
806with lexical scope, side-effects (if you implement the string
1380d8c1
JM
807functions), etc. However, our little interpreter has not quite reached
808Lisp-ness yet. The next several steps will take your implementation
809from a neat toy to a full featured language.
0f4ca9d1 810
45a8b3ca 811#### Deferrable:
0f4ca9d1
JM
812
813* Implement Clojure-style variadic function parameters. Modify the
814 constructor/initializer for environments, so that if a "&" symbol is
815 encountered in the `binds` list, the next symbol in the `binds` list
816 after the "&" is bound to the rest of the `exprs` list that has not
817 been bound yet.
818
1c2eab90 819* Define a `not` function using mal itself. In `step4_if_fn_do.qx`
0f4ca9d1
JM
820 call the `rep` function with this string:
821 "(def! not (fn* (a) (if a false true)))".
822
823* Implement the strings functions in `core.qx`. To implement these
824 functions, you will need to implement the string support in the
45a8b3ca 825 reader and printer (deferrable section of step 1). Each of the string
0f4ca9d1
JM
826 functions takes multiple mal values, prints them (`pr_str`) and
827 joins them together into a new string.
828 * `pr-str`: calls `pr_str` on each argument with `print_readably`
829 set to true, joins the results with " " and returns the new
830 string.
831 * `str`: calls `pr_str` on each argument with `print_readably` set
832 to false, concatenates the results together ("" separator), and
833 returns the new string.
834 * `prn`: calls `pr_str` on each argument with `print_readably` set
835 to true, joins the results with " ", prints the string to the
836 screen and then returns `nil`.
837 * `println`: calls `pr_str` on each argument with `print_readably` set
838 to false, joins the results with " ", prints the string to the
839 screen and then returns `nil`.
840
841
842<a name="step5"></a>
843
844### Step 5: Tail call optimization
845
846![step5_tco architecture](step5_tco.png)
847
848In step 4 you added special forms `do`, `if` and `fn*` and you defined
849some core functions. In this step you will add a Lisp feature called
850tail call optimization (TCO). Also called "tail recursion" or
851sometimes just "tail calls".
852
853Several of the special forms that you have defined in `EVAL` end up
854calling back into `EVAL`. For those forms that call `EVAL` as the last
855thing that they do before returning (tail call) you will just loop back
856to the beginning of eval rather than calling it again. The advantage
857of this approach is that it avoids adding more frames to the call
1c2eab90
JK
858stack. This is especially important in Lisp languages because they tend
859to prefer using recursion instead of iteration for control structures.
860(Though some Lisps, such as Common Lisp, have iteration.) However, with
861tail call optimization, recursion can be made as stack efficient as
862iteration.
0f4ca9d1
JM
863
864Compare the pseudocode for step 4 and step 5 to get a basic idea of
865the changes that will be made during this step:
866```
867diff -urp ../process/step4_if_fn_do.txt ../process/step5_tco.txt
868```
869
0303d604 870* Copy `step4_if_fn_do.qx` to `step5_tco.qx`.
0f4ca9d1
JM
871
872* Add a loop (e.g. while true) around all code in `EVAL`.
873
874* Modify each of the following form cases to add tail call recursion
875 support:
876 * `let*`: remove the final `EVAL` call on the second `ast` argument
877 (third list element). Set `env` (i.e. the local variable passed in
878 as second parameter of `EVAL`) to the new let environment. Set
879 `ast` (i.e. the local variable passed in as first parameter of
880 `EVAL`) to be the second `ast` argument. Continue at the beginning
881 of the loop (no return).
882 * `do`: change the `eval_ast` call to evaluate all the parameters
4e7296f9
JM
883 except for the last (2nd list element up to but not including
884 last). Set `ast` to the last element of `ast`. Continue
0f4ca9d1
JM
885 at the beginning of the loop (`env` stays unchanged).
886 * `if`: the condition continues to be evaluated, however, rather
887 than evaluating the true or false branch, `ast` is set to the
888 unevaluated value of the chosen branch. Continue at the beginning
889 of the loop (`env` is unchanged).
890
891* The return value from the `fn*` special form will now become an
892 object/structure with attributes that allow the default invoke case
893 of `EVAL` to do TCO on mal functions. Those attributes are:
0f4ca9d1
JM
894 * `ast`: the second `ast` argument (third list element) representing
895 the body of the function.
896 * `params`: the first `ast` argument (second list element)
897 representing the parameter names of the function.
898 * `env`: the current value of the `env` parameter of `EVAL`.
4e7296f9
JM
899 * `fn`: the original function value (i.e. what was return by `fn*`
900 in step 4). Note that this is deferrable until step 9 when it is
16ca60da
JM
901 required for the `map` and `apply` core functions). You will also
902 need it in step 6 if you choose to not to defer atoms/`swap!` from
903 that step.
0f4ca9d1
JM
904
905* The default "apply"/invoke case of `EVAL` must now be changed to
906 account for the new object/structure returned by the `fn*` form.
907 Continue to call `eval_ast` on `ast`. The first element is `f`.
908 Switch on the type of `f`:
909 * regular function (not one defined by `fn*`): apply/invoke it as
4e7296f9 910 before (in step 4).
0f4ca9d1
JM
911 * a `fn*` value: set `ast` to the `ast` attribute of `f`. Generate
912 a new environment using the `env` and `params` attributes of `f`
913 as the `outer` and `binds` arguments and rest `ast` arguments
914 (list elements 2 through the end) as the `exprs` argument. Set
915 `env` to the new environment. Continue at the beginning of the loop.
916
917Run some manual tests from previous steps to make sure you have not
918broken anything by adding TCO.
919
920Now go to the top level, run the step 5 tests.
921
922```
e5737b08 923make "test^quux^step5"
0f4ca9d1
JM
924```
925
926Look at the step 5 test file `tests/step5_tco.mal`. The `sum-to`
927function cannot be tail call optimized because it does something after
928the recursive call (`sum-to` calls itself and then does the addition).
929Lispers say that the `sum-to` is not in tail position. The `sum2`
930function however, calls itself from tail position. In other words, the
931recursive call to `sum2` is the last action that `sum2` does. Calling
932`sum-to` with a large value will cause a stack overflow exception in
933most target languages (some have super-special tricks they use to
934avoid stack overflows).
935
936Congratulations, your mal implementation already has a feature (TCO)
937that most mainstream languages lack.
938
939
940<a name="step6"></a>
941
396d869e 942### Step 6: Files, Mutation, and Evil
0f4ca9d1
JM
943
944![step6_file architecture](step6_file.png)
945
946In step 5 you added tail call optimization. In this step you will add
947some string and file operations and give your implementation a touch
948of evil ... er, eval. And as long as your language supports function
949closures, this step will be quite simple. However, to complete this
950step, you must implement string type support, so if you have been
951holding off on that you will need to go back and do so.
952
953Compare the pseudocode for step 5 and step 6 to get a basic idea of
954the changes that will be made during this step:
955```
956diff -urp ../process/step5_tco.txt ../process/step6_file.txt
957```
958
ffd31966
JM
959* Copy `step5_tco.qx` to `step6_file.qx`.
960
0f4ca9d1
JM
961* Add two new string functions to the core namespaces:
962 * `read-string`: this function just exposes the `read_str` function
963 from the reader. If your mal string type is not the same as your
964 target language (e.g. statically typed language) then your
965 `read-string` function will need to unbox (extract) the raw string
966 from the mal string type in order to call `read_str`.
967 * `slurp`: this function takes a file name (string) and returns the
968 contents of the file as a string. Once again, if your mal string
969 type wraps a raw target language string, then you will need to
970 unmarshall (extract) the string parameter to get the raw file name
971 string and marshall (wrap) the result back to a mal string type.
972
6fef8e58
JM
973* In your main program, add a new symbol "eval" to your REPL
974 environment. The value of this new entry is a function that takes
975 a single argument `ast`. The closure calls the your `EVAL` function
976 using the `ast` as the first argument and the REPL environment
977 (closed over from outside) as the second argument. The result of
978 the `EVAL` call is returned. This simple but powerful addition
979 allows your program to treat mal data as a mal program. For example,
cfdf00cc 980 you can now to this:
6fef8e58
JM
981```
982(def! mal-prog (list + 1 2))
983(eval mal-prog)
984```
0f4ca9d1
JM
985
986* Define a `load-file` function using mal itself. In your main
987 program call the `rep` function with this string:
988 "(def! load-file (fn* (f) (eval (read-string (str \"(do \" (slurp f) \")\")))))".
989
990Try out `load-file`:
991 * `(load-file "../tests/incA.mal")` -> `9`
992 * `(inc4 3)` -> `7`
993
994The `load-file` function does the following:
995 * Call `slurp` to read in a file by name. Surround the contents with
996 "(do ...)" so that the whole file will be treated as a single
997 program AST (abstract syntax tree).
998 * Call `read-string` on the string returned from `slurp`. This uses
999 the reader to read/convert the file contents into mal data/AST.
1000 * Call `eval` (the one in the REPL environment) on the AST returned
1001 from `read-string` to "run" it.
1002
627bd6f7
DM
1003Besides adding file and eval support, we'll add support for the atom data type
1004in this step. An atom is the Mal way to represent *state*; it is
1005heavily inspired by [Clojure's atoms](http://clojure.org/state). An atom holds
1006a reference to a single Mal value of any type; it supports reading that Mal value
1007and *modifying* the reference to point to another Mal value. Note that this is
1008the only Mal data type that is mutable (but the Mal values it refers to are
1009still immutable; immutability is explained in greater detail in step 7).
a85b8454 1010You'll need to add 5 functions to the core namespace to support atoms:
627bd6f7
DM
1011
1012 * `atom`: Takes a Mal value and returns a new atom which points to that Mal value.
1013 * `atom?`: Takes an argument and returns `true` if the argument is an atom.
1014 * `deref`: Takes an atom argument and returns the Mal value referenced by this atom.
1015 * `reset!`: Takes an atom and a Mal value; the atom is modified to refer to
1016 the given Mal value. The Mal value is returned.
1017 * `swap!`: Takes an atom, a function, and zero or more function arguments. The
1c2eab90 1018 atom's value is modified to the result of applying the function with the atom's
627bd6f7
DM
1019 value as the first argument and the optionally given function arguments as
1020 the rest of the arguments. The new atom's value is returned. (Side note: Mal is
1021 single-threaded, but in concurrent languages like Clojure, `swap!` promises
1022 atomic update: `(swap! myatom (fn* [x] (+ 1 x)))` will always increase the
1023 `myatom` counter by one and will not suffer from missing updates when the
1024 atom is updated from multiple threads.)
1025
1026Optionally, you can add a reader macro `@` which will serve as a short form for
1027`deref`, so that `@a` is equivalent to `(deref a)`. In order to do that, modify
1028the conditional in reader `read_form` function and add a case which deals with
1029the `@` token: if the token is `@` (at sign) then return a new list that
1030contains the symbol `deref` and the result of reading the next form
1031(`read_form`).
1032
0f4ca9d1 1033Now go to the top level, run the step 6 tests. The optional tests will
627bd6f7
DM
1034need support from the reader for comments, vectors, hash-maps and the `@`
1035reader macro:
0f4ca9d1 1036```
e5737b08 1037make "test^quux^step6"
0f4ca9d1
JM
1038```
1039
1040Congratulations, you now have a full-fledged scripting language that
6fef8e58
JM
1041can run other mal programs. The `slurp` function loads a file as
1042a string, the `read-string` function calls the mal reader to turn that
10a76646 1043string into data, and the `eval` function takes data and evaluates it
6fef8e58
JM
1044as a normal mal program. However, it is important to note that the
1045`eval` function is not just for running external programs. Because mal
1046programs are regular mal data structures, you can dynamically generate
1047or manipulate those data structures before calling `eval` on them.
a85b8454 1048This isomorphism (same shape) between data and programs is known as
6fef8e58
JM
1049"homoiconicity". Lisp languages are homoiconic and this property
1050distinguishes them from most other programming languages.
1051
1052You mal implementation is quite powerful already but the set of
1053functions that are available (from `core.qx`) is fairly limited. The
1054bulk of the functions you will add are described in step 9 and step A,
1055but you will begin to flesh them out over the next few steps to
1056support quoting (step 7) and macros (step 8).
0f4ca9d1
JM
1057
1058
45a8b3ca 1059#### Deferrable:
0f4ca9d1
JM
1060
1061* Add the ability to run another mal program from the command line.
1062 Prior to the REPL loop, check if your mal implementation is called
1063 with command line arguments. If so, treat the first argument as
1064 a filename and use `rep` to call `load-file` on that filename, and
1065 finally exit/terminate execution.
1066
1067* Add the rest of the command line arguments to your REPL environment
1068 so that programs that are run with `load-file` have access to their
a85b8454 1069 calling environment. Add a new "\*ARGV\*" (symbol) entry to your REPL
0f4ca9d1
JM
1070 environment. The value of this entry should be the rest of the
1071 command line arguments as a mal list value.
1072
1073
1074<a name="step7"></a>
1075
1076### Step 7: Quoting
1077
1078![step7_quote architecture](step7_quote.png)
1079
1080In step 7 you will add the special forms `quote` and `quasiquote` and
1081add supporting core functions `cons` and `concat`. The two quote forms
1082add a powerful abstraction for manipulating mal code itself
1083(meta-programming).
1084
0f4ca9d1
JM
1085The `quote` special form indicates to the evaluator (`EVAL`) that the
1086parameter should not be evaluated (yet). At first glance, this might
10a76646 1087not seem particularly useful but an example of what this enables is the
0f4ca9d1
JM
1088ability for a mal program to refer to a symbol itself rather than the
1089value that it evaluates to. Likewise with lists. For example, consider
1090the following:
1091
1092* `(prn abc)`: this will lookup the symbol `abc` in the current
1093 evaluation environment and print it. This will result in error if
1094 `abc` is not defined.
1095* `(prn (quote abc))`: this will print "abc" (prints the symbol
1096 itself). This will work regardless of whether `abc` is defined in
1097 the current environment.
1098* `(prn (1 2 3))`: this will result in an error because `1` is not
1099 a function and cannot be applied to the arguments `(2 3)`.
1100* `(prn (quote (1 2 3)))`: this will print "(1 2 3)".
1101* `(def! l (quote (1 2 3)))`: list quoting allows us to define lists
1102 directly in the code (list literal). Another way of doing this is
1103 with the list function: `(def! l (list 1 2 3))`.
1104
1105The second special quoting form is `quasiquote`. This allows a quoted
1106list to have internal elements of the list that are temporarily
1107unquoted (normal evaluation). There are two special forms that only
1108mean something within a quasiquoted list: `unquote` and
1109`splice-unquote`. These are perhaps best explained with some examples:
1110
1111* `(def! lst (quote (2 3)))` -> `(2 3)`
1112* `(quasiquote (1 (unquote lst)))` -> `(1 (2 3))`
1113* `(quasiquote (1 (splice-unquote lst)))` -> `(1 2 3)`
1114
1115The `unquote` form turns evaluation back on for its argument and the
1116result of evaluation is put in place into the quasiquoted list. The
1117`splice-unquote` also turns evaluation back on for its argument, but
1118the evaluated value must be a list which is then "spliced" into the
1119quasiquoted list. The true power of the quasiquote form will be
1c2eab90 1120manifest when it is used together with macros (in the next step).
0f4ca9d1 1121
ffd31966
JM
1122Compare the pseudocode for step 6 and step 7 to get a basic idea of
1123the changes that will be made during this step:
1124```
1125diff -urp ../process/step6_file.txt ../process/step7_quote.txt
1126```
1127
1128* Copy `step6_file.qx` to `step7_quote.qx`.
1129
0f4ca9d1
JM
1130* Before implementing the quoting forms, you will need to implement
1131* some supporting functions in the core namespace:
1132 * `cons`: this function takes a list as its second
1133 parameter and returns a new list that has the first argument
1134 prepended to it.
1135 * `concat`: this functions takes 0 or more lists as
1136 parameters and returns a new list that is a concatenation of all
1137 the list parameters.
1138
1139An aside on immutability: note that neither cons or concat mutate
1140their original list arguments. Any references to them (i.e. other
1141lists that they may be "contained" in) will still refer to the
1142original unchanged value. Mal, like Clojure, is a language which uses
1143immutable data structures. I encourage you to read about the power and
1144importance of immutability as implemented in Clojure (from which
1145Mal borrows most of its syntax and feature-set).
1146
1147* Add the `quote` special form. This form just returns its argument
1148 (the second list element of `ast`).
1149
1150* Add the `quasiquote` special form. First implement a helper function
1151 `is_pair` that returns true if the parameter is a non-empty list.
1152 Then define a `quasiquote` function. This is called from `EVAL` with
1153 the first `ast` argument (second list element) and then `ast` is set
1154 to the result and execution continues at the top of the loop (TCO).
1155 The `quasiquote` function takes a parameter `ast` and has the
1156 following conditional:
1157 1. if `is_pair` of `ast` is false: return a new list containing:
1158 a symbol named "quote" and `ast`.
1159 2. else if the first element of `ast` is a symbol named "unquote":
1160 return the second element of `ast`.
1c2eab90 1161 3. if `is_pair` of the first element of `ast` is true and the first
0f4ca9d1
JM
1162 element of first element of `ast` (`ast[0][0]`) is a symbol named
1163 "splice-unquote": return a new list containing: a symbol named
1164 "concat", the second element of first element of `ast`
1165 (`ast[0][1]`), and the result of calling `quasiquote` with the
1166 second through last element of `ast`.
1167 4. otherwise: return a new list containing: a symbol named "cons", the
1168 result of calling `quasiquote` on first element of `ast`
1c2eab90 1169 (`ast[0]`), and the result of calling `quasiquote` with the second
0f4ca9d1
JM
1170 through last element of `ast`.
1171
1172
1173Now go to the top level, run the step 7 tests:
1174```
e5737b08 1175make "test^quux^step7"
0f4ca9d1
JM
1176```
1177
1178Quoting is one of the more mundane functions available in mal, but do
1179not let that discourage you. Your mal implementation is almost
1180complete, and quoting sets the stage for the next very exiting step:
1181macros.
1182
1183
45a8b3ca 1184#### Deferrable
0f4ca9d1
JM
1185
1186* The full names for the quoting forms are fairly verbose. Most Lisp
1187 languages have a short-hand syntax and Mal is no exception. These
1188 short-hand syntaxes are known as reader macros because they allow us
1189 to manipulate mal code during the reader phase. Macros that run
1190 during the eval phase are just called "macros" and are described in
1191 the next section. Expand the conditional with reader `read_form`
1192 function to add the following four cases:
1193 * token is "'" (single quote): return a new list that contains the
1194 symbol "quote" and the result of reading the next form
1195 (`read_form`).
8f41f75a 1196 * token is "\`" (back-tick): return a new list that contains the
0f4ca9d1
JM
1197 symbol "quasiquote" and the result of reading the next form
1198 (`read_form`).
1199 * token is "~" (tilde): return a new list that contains the
1200 symbol "unquote" and the result of reading the next form
1201 (`read_form`).
1202 * token is "~@" (tilde + at sign): return a new list that contains
1203 the symbol "splice-unquote" and the result of reading the next
1204 form (`read_form`).
1205
1206* Add support for quoting of vectors. The `is_pair` function should
1207 return true if the argument is a non-empty list or vector. `cons`
1208 should also accept a vector as the second argument. The return value
1209 is a list regardless. `concat` should support concatenation of
1210 lists, vectors, or a mix or both. The result is always a list.
1211
1212
1213<a name="step8"></a>
1214
1215### Step 8: Macros
1216
1217![step8_macros architecture](step8_macros.png)
1218
a85b8454 1219Your mal implementation is now ready for one of the most lispy and
0f4ca9d1
JM
1220exciting of all programming concepts: macros. In the previous step,
1221quoting enabled some simple manipulation data structures and therefore
1222manipulation of mal code (because the `eval` function from step
12236 turns mal data into code). In this step you will be able to mark mal
1224functions as macros which can manipulate mal code before it is
1225evaluated. In other words, macros are user-defined special forms. Or
1226to look at it another way, macros allow mal programs to redefine
1227the mal language itself.
1228
1229Compare the pseudocode for step 7 and step 8 to get a basic idea of
1230the changes that will be made during this step:
1231```
1232diff -urp ../process/step7_quote.txt ../process/step8_macros.txt
1233```
1234
ffd31966
JM
1235* Copy `step7_quote.qx` to `step8_macros.qx`.
1236
1237
0f4ca9d1
JM
1238You might think that the infinite power of macros would require some
1239sort of complex mechanism, but the implementation is actually fairly
1240simple.
1241
1242* Add a new attribute `is_macro` to mal function types. This should
1243 default to false.
1244
1245* Add a new special form `defmacro!`. This is very similar to the
1246 `def!` form, but before the evaluated value (mal function) is set in
1247 the environment, the `is_macro` attribute should be set to true.
1248
1249* Add a `is_macro_call` function: This function takes arguments `ast`
1250 and `env`. It returns true if `ast` is a list that contains a symbol
1251 as the first element and that symbol refers to a function in the
1252 `env` environment and that function has the `is_macro` attribute set
1253 to true. Otherwise, it returns false.
1254
1255* Add a `macroexpand` function: This function takes arguments `ast`
1256 and `env`. It calls `is_macro_call` with `ast` and `env` and loops
1257 while that condition is true. Inside the loop, the first element of
1258 the `ast` list (a symbol), is looked up in the environment to get
1259 the macro function. This macro function is then called/applied with
1260 the rest of the `ast` elements (2nd through the last) as arguments.
1261 The return value of the macro call becomes the new value of `ast`.
1262 When the loop completes because `ast` no longer represents a macro
1263 call, the current value of `ast` is returned.
1264
1265* In the evaluator (`EVAL`) before the special forms switch (apply
1266 section), perform macro expansion by calling the `macroexpand`
1267 function with the current value of `ast` and `env`. Set `ast` to the
1268 result of that call. If the new value of `ast` is no longer a list
9b4cfe03
JM
1269 after macro expansion, then return the result of calling `eval_ast`
1270 on it, otherwise continue with the rest of the apply section
1271 (special forms switch).
0f4ca9d1
JM
1272
1273* Add a new special form condition for `macroexpand`. Call the
1274 `macroexpand` function using the first `ast` argument (second list
1275 element) and `env`. Return the result. This special form allows
1276 a mal program to do explicit macro expansion without applying the
1277 result (which can be useful for debugging macro expansion).
1278
1279Now go to the top level, run the step 8 tests:
1280```
e5737b08 1281make "test^quux^step8"
0f4ca9d1
JM
1282```
1283
8a98ef9a
JM
1284There is a reasonably good chance that the macro tests will not pass
1285the first time. Although the implementation of macros is fairly
1286simple, debugging runtime bugs with macros can be fairly tricky. If
1287you do run into subtle problems that are difficult to solve, let me
1288recommend a couple of approaches:
1289
1290* Use the macroexpand special form to eliminate one of the layers of
1291 indirection (to expand but skip evaluate). This will often reveal
1292 the source of the issue.
1293* Add a debug print statement to the top of your main `eval` function
1294 (inside the TCO loop) to print the current value of `ast` (hint use
1295 `pr_str` to get easier to debug output). Pull up the step8
1296 implementation from another language and uncomment its `eval`
1297 function (yes, I give you permission to violate the rule this once).
1298 Run the two side-by-side. The first difference is likely to point to
1299 the bug.
1300
1301Congratulations! You now have a Lisp interpreter with a super power
1302that most non-Lisp languages can only dream of (I have it on good
1303authority that languages dream when you are not using them). If you
1304are not already familiar with Lisp macros, I suggest the following
a85b8454 1305exercise: write a recursive macro that handles postfixed mal code
8a98ef9a
JM
1306(with the function as the last parameter instead of the first). Or
1307not. I have not actually done so myself, but I have heard it is an
a85b8454 1308interesting exercise.
8a98ef9a
JM
1309
1310In the next step you will add try/catch style exception handling to
1311your implementation in addition to some new core functions. After
1312step9 you will be very close to having a fully self-hosting mal
1313implementation. Let us continue!
1314
0f4ca9d1 1315
dc791440 1316#### Deferrable
0f4ca9d1
JM
1317
1318* Add the following new core functions which are frequently used in
1319 macro functions:
1320 * `nth`: this function takes a list (or vector) and a number (index)
1321 as arguments, returns the element of the list at the given index.
1322 If the index is out of range, this function raises an exception.
1323 * `first`: this function takes a list (or vector) as its argument
1324 and return the first element. If the list (or vector) is empty or
1325 is `nil` then `nil` is returned.
1326 * `rest`: this function takes a list (or vector) as its argument and
1327 returns a new list containing all the elements except the first.
1328
1329* In the main program, use the `rep` function to define two new
1330 control structures macros. Here are the string arguments for `rep`
1331 to define these macros:
1332 * `cond`: "(defmacro! cond (fn* (& xs) (if (> (count xs) 0) (list 'if (first xs) (if (> (count xs) 1) (nth xs 1) (throw \"odd number of forms to cond\")) (cons 'cond (rest (rest xs)))))))"
dd4020b9
JM
1333 * Note that `cond` calls the `throw` function when `cond` is
1334 called with an odd number of args. The `throw` function is
1335 implemented in the next step, but it will still serve it's
1336 purpose here by causing an undefined symbol error.
0f4ca9d1
JM
1337 * `or`: "(defmacro! or (fn* (& xs) (if (empty? xs) nil (if (= 1 (count xs)) (first xs) `(let* (or_FIXME ~(first xs)) (if or_FIXME or_FIXME (or ~@(rest xs))))))))"
1338
1339
ffd31966
JM
1340<a name="step9"></a>
1341
1342### Step 9: Try
1343
1344![step9_try architecture](step9_try.png)
1345
2345a3da
JM
1346In this step you will implement the final mal special form for
1347error/exception handling: `try*/catch*`. You will also add several core
10a76646
AS
1348functions to your implementation. In particular, you will enhance the
1349functional programming pedigree of your implementation by adding the
2345a3da
JM
1350`apply` and `map` core functions.
1351
ffd31966 1352Compare the pseudocode for step 8 and step 9 to get a basic idea of
dbac60df 1353the changes that will be made during this step:
ffd31966
JM
1354```
1355diff -urp ../process/step8_macros.txt ../process/step9_try.txt
1356```
1357
1358* Copy `step8_macros.qx` to `step9_try.qx`.
1359
2345a3da
JM
1360* Add the `try*/catch*` special form to the EVAL function. The
1361 try catch form looks like this: `(try* A (catch* B C))`. The form
1362 `A` is evaluated, if it throws an exception, then form `C` is
1c2eab90 1363 evaluated with a new environment that binds the symbol `B` to the
2345a3da
JM
1364 value of the exception that was thrown.
1365 * If your target language has built-in try/catch style exception
1366 handling then you are already 90% of the way done. Add a
1c2eab90 1367 (native language) try/catch block that evaluates `A` within
2345a3da
JM
1368 the try block and catches all exceptions. If an exception is
1369 caught, then translate it to a mal type/value. For native
1370 exceptions this is either the message string or a mal hash-map
1371 that contains the message string and other attributes of the
1c2eab90 1372 exception. When a regular mal type/value is used as an
2345a3da
JM
1373 exception, you will probably need to store it within a native
1374 exception type in order to be able to convey/transport it using
1375 the native try/catch mechanism. Then you will extract the mal
1376 type/value from the native exception. Create a new mal environment
1c2eab90 1377 that binds `B` to the value of the exception. Finally, evaluate `C`
2345a3da
JM
1378 using that new environment.
1379 * If your target language does not have built-in try/catch style
1380 exception handling then you have some extra work to do. One of the
1381 most straightforward approaches is to create a a global error
1382 variable that stores the thrown mal type/value. The complication
1383 is that there are a bunch of places where you must check to see if
1384 the global error state is set and return without proceeding. The
1385 rule of thumb is that this check should happen at the top of your
1386 EVAL function and also right after any call to EVAL (and after any
1387 function call that might happen to call EVAL further down the
1388 chain). Yes, it is ugly, but you were warned in the section on
1389 picking a language.
1390
cfdf00cc 1391* Add the `throw` core function.
2345a3da
JM
1392 * If your language supports try/catch style exception handling, then
1393 this function takes a mal type/value and throws/raises it as an
1394 exception. In order to do this, you may need to create a custom
1395 exception object that wraps a mal value/type.
1396 * If your language does not support try/catch style exception
1397 handling, then set the global error state to the mal type/value.
1398
1399* Add the `apply` and `map` core functions. In step 5, if you did not
1400 add the original function (`fn`) to the structure returned from
bd62ff74
JM
1401 `fn*`, the you will need to do so now.
1402 * `apply`: takes at least two arguments. The first argument is
1403 a function and the last argument is list (or vector). The
a85b8454 1404 arguments between the function and the last argument (if there are
bd62ff74
JM
1405 any) are concatenated with the final argument to create the
1406 arguments that are used to call the function. The apply
1407 function allows a function to be called with arguments that are
1408 contained in a list (or vector). In other words, `(apply F A B [C
1409 D])` is equivalent to `(F A B C D)`.
1410 * `map`: takes a function and a list (or vector) and evaluates the
1411 function against every element of the list (or vector) one at
1412 a time and returns the results as a list.
2345a3da
JM
1413
1414* Add some type predicates core functions. In Lisp, predicates are
1415 functions that return true/false (or true value/nil) and typically
bd62ff74
JM
1416 end in "?" or "p".
1417 * `nil?`: takes a single argument and returns true (mal true value)
1418 if the argument is nil (mal nil value).
1419 * `true?`: takes a single argument and returns true (mal true value)
1420 if the argument is a true value (mal true value).
1421 * `false?`: takes a single argument and returns true (mal true
1422 value) if the argument is a false value (mal false value).
1423 * `symbol?`: takes a single argument and returns true (mal true
1424 value) if the argument is a symbol (mal symbol value).
1425
1426Now go to the top level, run the step 9 tests:
1427```
e5737b08 1428make "test^quux^step9"
bd62ff74
JM
1429```
1430
1431Your mal implementation is now essentially a fully featured Lisp
1432interpreter. But if you stop now you will miss one of the most
1433satisfying and enlightening aspects of creating a mal implementation:
1434self-hosting.
2345a3da 1435
dc791440 1436#### Deferrable
2345a3da 1437
e37d9b49
JM
1438* Add the following new core functions:
1439 * `symbol`: takes a string and returns a new symbol with the string
1440 as its name.
1441 * `keyword`: takes a string and returns a keyword with the same name
1442 (usually just be prepending the special keyword
1443 unicode symbol). This function should also detect if the argument
1444 is already a keyword and just return it.
1445 * `keyword?`: takes a single argument and returns true (mal true
1446 value) if the argument is a keyword, otherwise returns false (mal
1447 false value).
1448 * `vector`: takes a variable number of arguments and returns
1449 a vector containing those arguments.
1450 * `vector?`: takes a single argument and returns true (mal true
1451 value) if the argument is a vector, otherwise returns false (mal
1452 false value).
1453 * `hash-map`: takes a variable but even number of arguments and
1454 returns a new mal hash-map value with keys from the odd arguments
1455 and values from the even arguments respectively. This is basically
1456 the functional form of the `{}` reader literal syntax.
1457 * `map?`: takes a single argument and returns true (mal true
1458 value) if the argument is a hash-map, otherwise returns false (mal
1459 false value).
1460 * `assoc`: takes a hash-map as the first argument and the remaining
1461 arguments are odd/even key/value pairs to "associate" (merge) into
1462 the hash-map. Note that the original hash-map is unchanged
1463 (remember, mal values are immutable), and a new hash-map
1464 containing the old hash-maps key/values plus the merged key/value
1465 arguments is returned.
1466 * `dissoc`: takes a hash-map and a list of keys to remove from the
1467 hash-map. Again, note that the original hash-map is unchanged and
1468 a new hash-map with the keys removed is returned. Key arguments
1469 that do not exist in the hash-map are ignored.
1470 * `get`: takes a hash-map and a key and returns the value of looking
1471 up that key in the hash-map. If the key is not found in the
1472 hash-map then nil is returned.
1473 * `contains?`: takes a hash-map and a key and returns true (mal true
1474 value) if the key exists in the hash-map and false (mal false
1475 value) otherwise.
1476 * `keys`: takes a hash-map and returns a list (mal list value) of
1477 all the keys in the hash-map.
1478 * `vals`: takes a hash-map and returns a list (mal list value) of
1479 all the values in the hash-map.
1480 * `sequential?`: takes a single arguments and returns true (mal true
1481 value) if it is a list or a vector, otherwise returns false (mal
1482 false value).
ffd31966
JM
1483
1484
8569b2af 1485<a name="stepA"></a>
ffd31966 1486
396d869e 1487### Step A: Metadata, Self-hosting and Interop
ffd31966 1488
90f618cb 1489![stepA_mal architecture](stepA_mal.png)
ffd31966 1490
bd62ff74
JM
1491You have reached the final step of your mal implementation. This step
1492is kind of a catchall for things that did not fit into other steps.
1493But most importantly, the changes you make in this step will unlock
1494the magical power known as "self-hosting". You might have noticed
1495that one of the languages that mal is implemented in is "mal". Any mal
1496implementation that is complete enough can run the mal implementation
1497of mal. You might need to pull out your hammock and ponder this for
1498a while if you have never built a compiler or interpreter before. Look
1499at the step source files for the mal implementation of mal (it is not
1500cheating now that you have reached step A).
1501
1502If you deferred the implementation of keywords, vectors and hash-maps,
1503now is the time to go back and implement them if you want your
1504implementation to self-host.
1505
ffd31966
JM
1506Compare the pseudocode for step 9 and step A to get a basic idea of
1507the changes that will be made during this step:
1508```
dbac60df 1509diff -urp ../process/step9_try.txt ../process/stepA_mal.txt
ffd31966
JM
1510```
1511
90f618cb 1512* Copy `step9_try.qx` to `stepA_mal.qx`.
ffd31966 1513
396d869e
JM
1514* Add the `readline` core function. This functions takes a
1515 string that is used to prompt the user for input. The line of text
1516 entered by the user is returned as a string. If the user sends an
1517 end-of-file (usually Ctrl-D), then nil is returned.
1518
dd4020b9
JM
1519* Add meta-data support to mal functions by adding a new metadata
1520 attribute on mal functions that refers to another mal value/type
1521 (nil by default). Add the following metadata related core functions:
1522 * `meta`: this takes a single mal function argument and returns the
1523 value of the metadata attribute.
1524 * `with-meta`: this function takes two arguments. The first argument
1525 is a mal function and the second argument is another mal
1526 value/type to set as metadata. A copy of the mal function is
1527 returned that has its `meta` attribute set to the second argument.
1528 Note that it is important that the environment and macro attribute
1529 of mal function are retained when it is copied.
bd62ff74 1530
33e37b68
JM
1531* Add a new "\*host-language\*" (symbol) entry to your REPL
1532 environment. The value of this entry should be a mal string
0198b7a2 1533 containing the name of the current implementation.
33e37b68
JM
1534
1535* When the REPL starts up (as opposed to when it is called with
1536 a script and/or arguments), call the `rep` function with this string
1537 to print a startup header:
16ca60da 1538 "(println (str \"Mal [\" \*host-language\* \"]\"))".
33e37b68 1539
bd62ff74 1540
8f41f75a 1541Now go to the top level, run the step A tests:
bd62ff74 1542```
e5737b08 1543make "test^quux^stepA"
bd62ff74
JM
1544```
1545
1546Once you have passed all the non-optional step A tests, it is time to
1547try self-hosting. Run your step A implementation as normal, but use
1548the file argument mode you added in step 6 to run a each of the step
1549from the mal implementation:
1550```
1551./stepA_mal.qx ../mal/step1_read_print.mal
1552./stepA_mal.qx ../mal/step2_eval.mal
1553...
1554./stepA_mal.qx ../mal/step9_try.mal
1555./stepA_mal.qx ../mal/stepA_mal.mal
1556```
1557
10a76646 1558There is a very good chance that you will encounter an error at some
bd62ff74
JM
1559point while trying to run the mal in mal implementation steps above.
1560Debugging failures that happen while self-hosting is MUCH more
1561difficult and mind bending. One of the best approaches I have
cfdf00cc 1562personally found is to add prn statements to the mal implementation
bd62ff74
JM
1563step (not your own implementation of mal) that is causing problems.
1564
1565Another approach I have frequently used is to pull out the code from
1566the mal implementation that is causing the problem and simplify it
1567step by step until you have a simple piece of mal code that still
1568reproduces the problem. Once the reproducer is simple enough you will
1569probably know where in your own implementation that problem is likely
1570to be. Please add your simple reproducer as a test case so that future
1571implementers will fix similar issues in their code before they get to
1572self-hosting when it is much more difficult to track down and fix.
1573
1574Once you can manually run all the self-hosted steps, it is time to run
1575all the tests in self-hosted mode:
1576```
e5737b08 1577make MAL_IMPL=quux "test^mal"
bd62ff74
JM
1578```
1579
1580When you run into problems (which you almost certainly will), use the
1581same process described above to debug them.
1582
1583Congratulations!!! When all the tests pass, you should pause for
1584a moment and consider what you have accomplished. You have implemented
1585a Lisp interpreter that is powerful and complete enough to run a large
1586mal program which is itself an implementation of the mal language. You
1587might even be asking if you can continue the "inception" by using your
1588implementation to run a mal implementation which itself runs the mal
1589implementation.
1590
4881701a 1591
dc791440 1592#### Optional: gensym
4881701a
DM
1593
1594The `or` macro we introduced at step 8 has a bug. It defines a
1595variable called `or_FIXME`, which "shadows" such a binding from the
1596user's code (which uses the macro). If a user has a variable called
1597`or_FIXME`, it cannot be used as an `or` macro argument. In order to
1598fix that, we'll introduce `gensym`: a function which returns a symbol
1599which was never used before anywhere in the program. This is also an
1600example for the use of mal atoms to keep state (the state here being
1601the number of symbols produced by `gensym` so far).
1602
1603Previously you used `rep` to define the `or` macro. Remove that
1604definition and use `rep` to define the new counter, `gensym` function
1605and the clean `or` macro. Here are the string arguments you need to
1606pass to `rep`:
1607```
1608"(def! *gensym-counter* (atom 0))"
1609
1610"(def! gensym (fn* [] (symbol (str \"G__\" (swap! *gensym-counter* (fn* [x] (+ 1 x)))))))"
1611
1612"(defmacro! or (fn* (& xs) (if (empty? xs) nil (if (= 1 (count xs)) (first xs) (let* (condvar (gensym)) `(let* (~condvar ~(first xs)) (if ~condvar ~condvar (or ~@(rest xs)))))))))"
1613```
1614
1615For extra information read [Peter Seibel's thorough discussion about
1616`gensym` and leaking macros in Common Lisp](http://www.gigamonkeys.com/book/macros-defining-your-own.html#plugging-the-leaks).
1617
1618
dc791440 1619#### Optional additions
bd62ff74 1620
dd4020b9
JM
1621* Add metadata support to other composite data types (lists, vectors
1622 and hash-maps), and to native functions.
50756494
DM
1623* Add the following new core functions:
1624 * `time-ms`: takes no arguments and returns the number of
a85b8454 1625 milliseconds since epoch (00:00:00 UTC January 1, 1970), or, if
50756494
DM
1626 not possible, since another point in time (`time-ms` is usually
1627 used relatively to measure time durations). After `time-ms` is
dd4020b9
JM
1628 implemented, you can run the performance micro-benchmarks by
1629 running `make perf^quux`.
50756494
DM
1630 * `conj`: takes a collection and one or more elements as arguments
1631 and returns a new collection which includes the original
1632 collection and the new elements. If the collection is a list, a
1633 new list is returned with the elements inserted at the start of
1634 the given list in opposite order; if the collection is a vector, a
1635 new vector is returned with the elements added to the end of the
1636 given vector.
396d869e
JM
1637 * `string?`: returns true if the parameter is a string.
1638 * `seq`: takes a list, vector, string, or nil. If an empty list,
1639 empty vector, or empty string ("") is passed in then nil is
1640 returned. Otherwise, a list is returned unchanged, a vector is
1641 converted into a list, and a string is converted to a list that
1642 containing the original string split into single character
1643 strings.
6440cc50
DM
1644* For interop with the target language, add this core function:
1645 * `quux-eval`: takes a string, evaluates it in the target language,
1646 and returns the result converted to the relevant Mal type. You
1647 may also add other interop functions as you see fit; Clojure, for
1648 example, has a function called `.` which allows calling Java
1649 methods. If the target language is a static language, consider
1650 using FFI or some language-specific reflection mechanism, if
1651 available. The tests for `quux-eval` and any other interop
1652 function should be added in `quux/tests/stepA_mal.mal` (see the
1653 [tests for `lua-eval`](../lua/tests/stepA_mal.mal) as an example).
ffd31966 1654
dd4020b9
JM
1655### Next Steps
1656
1657* Join the #mal IRC channel. It's fairly quiet but there are bursts of
1658 interesting conversation related to mal, Lisps, esoteric programming
1659 languages, etc.
1660* If you have created an implementation for a new target language (or
1661 a unique and interesting variant of an existing implementation),
1662 consider sending a pull request to add it into the main mal
1663 repository. The [FAQ](FAQ.md#add_implementation) describes general
1664 requirements for getting an implementation merged into the main
1665 repository.
1666* Take your interpreter implementation and have it emit source code in
1667 the target language rather than immediately evaluating it. In other
1668 words, create a compiler.
1669* Pick a new target language and implement mal in it. Pick a language
1670 that is very different from any that you already know.
1671* Use your mal implementation to implement a real world project. Many
1672 of these will force you to address interop. Some ideas:
1673 * Web server (with mal as CGI language for extra points)
1674 * An IRC/Slack chat bot
1675 * An editor (GUI or curses) with mal as a scripting/extension
1676 language.
1677 * An AI player for a game like Chess or Go.
1678* Implement a feature in your mal implementation that is not covered
1679 by this guide. Some ideas:
1680 * Namespaces
1681 * Multi-threading support
1682 * Errors with line numbers and/or stack traces.
1683 * Lazy sequences
1684 * Clojure-style protocols
1685 * Full call/cc (call-with-current-continuation) support
1686 * Explicit TCO (i.e. `recur`) with tail-position error checking