Ocaml assembly output – Andrew Birkett's blog

Having spend a lot of time recently looking at the assembly generated by the MS C++ compiler and Corman lisp, I was about to start investigating what style of assembly the ocaml compiler generates for typical constructs. Fortunately for me, some else has already got there first. It’s a pretty informative article, and goes a long way towards explaining why ocaml performs so well. Well, actually, the real reason is that Xavier is one clever cookie.

I read one of his early reports on the ZINC system when I was travelling around the world with iPAQ in tow, and the major performance gain at that point, if I remember correctly, concerned avoiding unnecessary construction of closures for curried function. So, even though function calls in ocaml are curried (ie. a “two argument function” is really a single argument function, which returns another single argument function) you don’t actually need to build up all the scaffholding for the intermediate function if you’re just going to immediately apply it to another value. This stops you building lots of intermediate closures on the heap. This was an innovation at the time, I imagine (in 1990).

The article also describes the boxing scheme used in ocaml, which uses the bottom bit to indicate whether a bit of data is an integer, or a pointer to a heap block. If all your heap blocks are word aligned, the bottom bit is redundant anyway so this is a neat efficient trick. [I know a few other unnamed people who should recognise such bittwiddling tomfoolery too 😉 ]

Last week at work, I started rewriting a small bit of code because I knew that there was a more elegant (and therefore more likely to be correct and maintainable) way to express what it was doing. Unfortunately, when I started editing the code I realised that I was thinking in ocaml! The “elegant way” required ocaml-style variants and pattern matching, but I was coding in C++. Eep! The closest C++ equivalent is almost a joke in constrast to the ocaml version.

Finally, to end this ocaml praise session, the internals of the compiler are elegant and clean. The various sections of the compiler are split out using ocaml’s powerful module facilities, and the functional style of programming (ie. minimal use of state) makes understanding the code a lot easier. By contrast, the internals of gcc are a hideous mess. Actually no, the internals of gcc *are* a hideous mess, period.

Mu, I’m somewhat paralysed with indecision regarding where I want to go with computer tools. I have so many ideas and things I want to try out, and I’ve seen so many great ideas consigned to the historical bit bucket. And also, despite the impression which this programming-only-blog might convey, there’s a million and one other non-computer things which I want to spend my time on. I think at some point I need to lower my ambition and focus on improving one particular thing, rather than riding along on a cascade of new ideas. But it’s annoying, because every day I see tools which I consider to be primitive and backwards compared to what is possible. I want to “do an Alan Kay” and burn the disks – look around, take what is good, and throw the rest away.