Starforth: A Minimal Self-Hosting Compiler

This is the first in a series of articles on writing a self-hosting compiler. It's an introductory post: no source code or actual development is discussed in this installment, and possibly not in the next one either. But we’ll get there… hopefully!

EDIT: Second installment is now available.

I’ve always been vaguely aware of Forth. It’s the language that works with only a stack, which also means it uses reverse Polish notation: instead of 2 + 2, you write 2 2 +.
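To make the stack idea a bit more concrete, here is a small sketch you could type into a standard Forth system (I'm assuming something like Gforth). The only word not mentioned above is `.`, which pops the top of the stack and prints it. The comments are my own annotation.

```forth
2 2 + .        \ push 2, push 2, add them, print the result: 4
2 3 + 4 * .    \ (2 + 3) * 4, no parentheses needed: prints 20
2 3 4 * + .    \ 2 + (3 * 4): prints 14
```

The order of the words alone decides what gets evaluated first, which is why reverse Polish notation needs neither parentheses nor precedence rules.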

My (inaccurate) knowledge of Forth ended there. But then recently I was reading up on Forth and I realized something. Here’s how you define a variable named foo in Forth:

variable foo

Here’s how you write 100 to foo:

100 foo !

And for reading it:

foo @

@ and ! are called “fetch” and “store” in Forth nomenclature. I sorta understood why you’d need “store”, but what about “fetch”? You don’t need an operator to read variables in other languages, do you? Also, if @ reads a variable, and puts its value on the stack, what exactly does foo put on the stack?

A bit of experimentation, and it dawned on me: foo puts the address of the variable on the stack, and @ takes an address off the stack and dereferences it, leaving the value in its place. Forth has pointers!
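Here is a minimal sketch of that realization, assuming a standard Forth system. The word `bump` is a hypothetical helper I made up for illustration; it is not part of Forth or of Starforth. It takes an address, not a value, which is exactly what you would do with a pointer elsewhere.

```forth
variable foo          \ reserve one cell; the word foo pushes that cell's address

100 foo !             \ ( 100 addr -- )  store 100 in the cell
foo @ .               \ ( addr -- 100 )  fetch the value, then print it: 100

\ Because foo is only an address, it can be handed to other words like a pointer.
: bump ( addr -- )    \ hypothetical helper: add 1 to the cell at addr
  dup @ 1+ swap ! ;

foo bump              \ pass the address, not the value
foo @ .               \ prints 101
```

Note that `foo bump` never fetches the value itself; `bump` does the fetch and the store through the address it receives.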

Just like that, Forth found its place in the language hierarchy I keep in my head. If Lisp is the highest-level language, Forth is the lowest-level one (except for assembly, obviously).

I liked that so much that I decided it was time to start a new compiler project. I’m gonna call it Starforth. For no reason at all!

Goals

I set these goals for myself:

And here are some decisions to make things more concrete:

Upcoming Series

I’ve already written Starforth at the time of this writing. You can find the source code on Sourcehut. I plan to write a series of articles about the development process, starting from my earliest iterations and going forward from there. I’ll be using my memory, which is hopefully still fresh, and my trusty git logs.



This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.