Searching for LISP-based implementations of different programming languages is a task akin to shooting a mixture of metaphors and pre-hatched chickens in a barrel.
Let's see who Google brings to this post based on that sentence.
I think I've written already that next semester I'll be the GSI (graduate student
instructor -- a UC-ism for teaching assistant
) for the local compilers
class, CS 164. The plan is to implement everything in Common Lisp. That means
that all the compiler stages (scanner, parser, optimizer, code generator) will be
written in Lisp, as will be the virtual machine on which the compiled
programs are to run. The language that we're compiling, though, is going to be a
Java variant.
I'm looking forward to using Lisp for this class. The last time I used Lisp for
very much was when I took an artificial intelligence course as an undergraduate,
but I have more than a passing familiarity with it. Indeed, I think that someone
who graduates with a degree in computer science and isn't familiar with
Lisp and Forth has been robbed. These two languages are fundamental. The fact
that they are so fundamental is actually the source of my current frustration,
but let me return to that after I explain briefly what I mean when I say that
these languages are fundamental.
Like human languages, programming languages have grammar. The grammmar
of a programming language describes the types of constructions which make up
acceptable programs. Like human languages, the grammar is far from the whole
picture; beyond the syntax, or the structure of an utterance, there is
the semantics, or the formal meaning of the utterance. The
sentence The blue flu blogs at midnight
is syntactically valid,
but semantically questionable. Programming language syntax is usually described
in BNF (Backus-Naur form -- you may insert your own pun on Backus-Bacchus here).
It's easiest to describe the notation by example. Suppose I wanted to write
down the syntax of simple arithmetic expressions involving plus, minus, times,
and divide. I might write the grammar like this:
expression := addend | expression + addend | expression - addend addend := multiplicand | addend * multiplicand | addend / multiplicand multiplicand := number | variable | ( expression )
In words, what I've written is that an expression consists of a sums and
differences of addends; an addend consists of products and quotients of
multiplicands; and a multiplicand is a number, a variable, or a parenthesized
expression. So if I wanted to decide whether the expression 1 + 2*(3+4)
was valid, I'd have to see if I could describe it according to these rules.
The description of the expression in terms of syntactic rules gives us a
parse tree, which is sort of like a grade school sentence diagram,
but more useful. Since I'm limited by HTML, let me use the following convention
for writing the parse tree above: when I use a particular rule, I'll write
the name of the thing I produce first, then the list of inputs to the rule;
and I'll wrap the whole thing in parentheses to prevent confusion about where
I am. In that case, the parse tree for 1 + 2*(3+4)
is
(expression (expression (addend (multiplicand (number 1)))) + (addend (multiplicand (number 2)) * (multiplicand oparen (expression (expression (addend (multiplicand (number 3)))) + (addend (multiplicand (number 4)))) cparen)))
I say that I write things this way because of the limitations of HTML, but even if I were a little more willing to draw sketches for you, I might decide to write the parse tree in the above way. Why? Because it's convenient for computer manipulation, and people have developed powerful tools for dealing with this sort of representation of expressions. It will not surprise you to learn that the above expression is a sort of proto-Lisp.
Of course, if I actually wanted to write the above expression in any dialect of Lisp that someone would use, it would look like
(+ 1 (* 2 (+ 3 4)))
Nevertheless, the point remains: in essence, a Lisp programmer writes a direct representation of the syntax trees for his programs. And trees are things that computer scientists are happy manipulating as data, which means that Lisp programmers can easily write programs that analyze and rewrite programs. And a compiler is essentially a tool for analyzing and rewriting programs, so lots of compiler writers use Lisp, or something very Lisp-like, inside their compiler.
Forth is what you get by turning Lisp on its head, so that instead of writing (+ 1 1) you write 1 1 +. The Forth model will be familiar to anyone who has used an HP calculator or looked at the specs for the Java Virtual Machine.
The problem: since Lisp and Forth are so damn easy to parse, everyone writes parsers for them -- and Google isn't smart enough to distinguish writing about parsers written in Lisp for other languages from writing about parsers written in other languages for Lisp. The latter are ubiquitous, the former a little less so -- hence my problem.
Since my battery is running low, ruminations on the resolution to this problem will have to wait for another day.