High Integrity Compilation – Andrew Birkett's blog

The book High Integrity Compilation is keeping my brain engaged as I munch my lunchtime sandwiches. The basic plot concerns writing a bug-free compiler, and the cast consists of the semantics of the high-level language being compiled and the semantics of the target assembly language. Happily, it’s really quite easy going. My previous encounters with formal semantics have generally left me feeling out of my depth. In contrast, this book deals with source/target languages which are just complex enough to show interesting properties but simple enough to pick up immediately.

I’ll try to give an example of why this book makes things easier. The “meaning” of a computer language is usually formally defined by a mapping from “programs” to some kind of mathematical object, like a set. This is useful because we can then leverage all of our mathematics knowledge to learn stuff about behaviour of the program. Now, most books on semantics start with a paragraph like “we’d like to use boring old primary-school sets as our mathematical objects, but they’re not powerful enough”. And before you know it, they’ve starting using domain theory and fixed-points as if they were as common as making a cup of tea. Around this point, I usually head over to slashdot and never return. However, the “High Integrity Compilation” book happily doesn’t disappear down this road. It stays safely up in the land of simple maths and easy state machines. The languages under consideration are simple imperative ones, and so there’s not the same focus on recursive functions that you get with ML-like languages and so you’re not immediately hit by a wall of complexity on page four of the book.

This book also dutifully walks through useful examples, working up from simple three-liners into multi-page walkthroughs. There’s a refreshing lack of the words “obviously”, “clearly” and “trivially” and instead plenty of concrete worked examples.

Of course, this means that the resulting compiler isn’t going to really change the world. The source and target languages are chosen for pedagogical reasons rather than real-world usefulness. But this means that, for someone like me learning on their own, you can quickly get a good grounding in semantics without having to deal with a whole load of picky little uninteresting details.