Wind it up and let it go

I read a quote somewhere along the lines of “once you’ve tasted Smalltalk, nothing tastes the same”. Okay, that’s a parody of a beer advert – I can’t find the actual quote now. But I’ve tasted smalltalk and I’ve tasted lisp, and now writing in C++ gives me the shivers. Edit-compile-test cycle? No thank you! And here’s why …

Recently I’ve been working on a small application which slurps a huuge dataset into memory, processes it for a bit and then writes some measurements out to disk. Now, it just so happens that I don’t know exactly how to process the dataset to get the best results. Maybe I need to use an offset of 35. Or maybe it’d work better if I used 70. I don’t know until I’ve tried a few values to see what works.

  Dataset dataset("input.dat");
  Answer answer = Process(dataset, 35);
  cout << "The answer is " << answer << "\n";

Reading the dataset into memory takes a good thirty seconds or so. So, I run it with "35" as the parameter. Ooh, that's really terrible. So, change it to "50", recompile, run the application. Wait another thirty seconds for it to reload the dataset. Zzzz, this is silly. Why do I need to reload the dataset each time?

This is the problem with C++. There are two separate worlds. You start in the "source code world" which is inhabited by variables, types and comments. Then you pass through the Portal Of Compilation to reach the "executable world" which is inhabited by data, stack frames, and heap blocks. You're not allowed back into "source code world" unless you throw all of your possessions into a big black hole, upon which they vanish forever.

Instead, I'd like to write a little bit of code which loads up the dataset, then switch into "executable world" and run it. Then I'd say "just wait there" to my stack frames and the heap while I nip back into "source code world" and write the next bit of code to process the dataset in some way. Then I'd return to "executable world" to run that code, see what the result is and then nip back to tweak my parameter. The "executable world" would always stick around, even though I'm changing the code.

Now, if you've only coded in C/C++ this will seem crazy. These languages are a bit like a little wind-up toy car. You spend a while winding up your program, and then to run it you put it on a table and let it go ... and it goes "wheeee" and careers off in some random direction for a while until something goes wrong. So, you pick it up, wind it up some more, and try again. Wheeeeeee, it goes off in a slightly different direction to meet a different fate.

With smalltalk or lisp, you're not just a passive observer who watches as the car races off to its doom. You are actually sitting inside the toy car, keeping it on course. If the car drifts off to the left a bit, you just correct it and continue on your way. You don't have to go all the way back to the start.

In smalltalk/lisp, there is no distinction between the "source code world" and the "executable world". There's just one world, in which you can edit code, run the code and use tools. In smalltalk the world consists of objects which pass messages to each other. In lisp, the world consists of cons cells and atoms, some of which are being eval'd. If you load some data into the world at 1pm, then you could write the code to process it at 2pm and it'll still be around. It's pretty rare to ever reboot the universe.

To be fair, any C++ programmer would point out that all I need to do is add a "What offset should I use?" prompt into the program and loop repeatedly so that you can try a value, look at the result and then try a new value. After all, if you don't know exactly which values you want at compile-time, you ask for them at runtime. You learned that when you wrote your first "Hello, (name)" program.
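For concreteness, that workaround might look roughly like the sketch below. The `Dataset`, `LoadDataset` and `Process` here are made-up stand-ins (the real ones aren't shown in this post), and `RunOffsetLoop` is just a name I've picked for illustration. The point is only that the slow load happens once and the loop then asks for offsets.

```cpp
#include <cassert>
#include <iostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for the real dataset type.
struct Dataset {
  std::vector<int> values;
};

// Pretend this is the slow, thirty-second load from disk.
Dataset LoadDataset() { return Dataset{{10, 20, 30, 40}}; }

// Placeholder processing step: sums every value shifted by the offset.
long Process(const Dataset& dataset, int offset) {
  assert(!dataset.values.empty());  // nothing sensible to do with no data
  long total = 0;
  for (int v : dataset.values) total += v + offset;
  return total;
}

// Loads the dataset once, then keeps asking for offsets until the input
// ends or the user types "q". Returns how many offsets were tried.
int RunOffsetLoop(std::istream& in, std::ostream& out) {
  const Dataset dataset = LoadDataset();  // paid for once, not per attempt
  int tries = 0;
  std::string line;
  while (out << "What offset should I use? " && std::getline(in, line)) {
    if (line == "q") break;
    std::istringstream parser(line);
    int offset = 0;
    if (!(parser >> offset)) {
      out << "Not an integer: " << line << "\n";
      continue;
    }
    out << "The answer is " << Process(dataset, offset) << "\n";
    ++tries;
  }
  return tries;
}
```

From `main` you'd call it as `RunOffsetLoop(std::cin, std::cout)`. Notice that the only thing you can say to it is an integer (or "q").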

I take a different view of this. I don't want to be faced with a "what offset should I use?" prompt, to which I can only answer using the language of integers. What if I want to reply "use the average of 210 and 110"? Or maybe "use the largest prime number between 10 and 50". Really I want to be able to answer by writing code if need be. In fact, I don't even need to be asked "what offset should I use?" - just give me control and I'll call the function and pass the parameter. Even when I'm "running" my code, I still want to be programming. Just because I'm supplying some data at "runtime" shouldn't mean that I have to use some crippled input language.

Programming languages give you huge expressive power, but then we switch into "executable world" and we're left with a somewhat lame debugger which might let us evaluate "a + b", if we're very lucky. When you walk through the Portal Of Compilation into "executable world", you pass a sign which says "thou shalt not write any more code". On the way back, the Portal erases all your memories from "executable world" so that you start from scratch next time through.

Lisp got this right from the start. The heart of lisp is the read-eval-print loop, which reads a bit of code, runs it and prints the result. It's so useful and powerful. When you're running a lisp program, you are still in the lisp world. At any point, your program can say "what do you want to do next?" and you can type in some lisp code and it just executes it. That's all that the lisp toplevel does. But the great thing is that you can have that power and functionality at any time and anywhere in your application. The debugger is just another bit of lisp code, just like the code you're writing. It's all just lisp code, living in the same lisp world.
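To show the shape of that loop, here is a toy read-eval-print loop sketched in C++. It is nowhere near a real lisp: it only knows integers and three built-in forms, and you can't define anything new from inside it. Every name in it (`Repl`, `Eval`, `avg`, `max-prime`, `process`) is invented for this sketch, and the dataset and `Process` are placeholders. Still, it lets you answer "what offset?" with an expression rather than a bare number, while the data stays in memory.

```cpp
#include <cassert>
#include <cstddef>
#include <exception>
#include <iostream>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-ins: data that stays loaded, and a processing step.
const std::vector<long> kDataset = {10, 20, 30, 40};
long Process(long offset) {
  long total = 0;
  for (long v : kDataset) total += v + offset;
  return total;
}

bool IsPrime(long n) {
  if (n < 2) return false;
  for (long d = 2; d * d <= n; ++d)
    if (n % d == 0) return false;
  return true;
}

// Splits a line into tokens, treating each parenthesis as its own token.
std::vector<std::string> Tokenize(const std::string& line) {
  std::string spaced;
  for (char c : line) {
    if (c == '(' || c == ')') { spaced += ' '; spaced += c; spaced += ' '; }
    else spaced += c;
  }
  std::istringstream in(spaced);
  std::vector<std::string> tokens;
  for (std::string t; in >> t;) tokens.push_back(t);
  return tokens;
}

// Applies one of the three built-in forms to already-evaluated arguments.
long Apply(const std::string& op, const std::vector<long>& args) {
  if (op == "avg" && !args.empty()) {
    long sum = 0;
    for (long a : args) sum += a;
    return sum / static_cast<long>(args.size());
  }
  if (op == "max-prime" && args.size() == 2) {
    for (long n = args[1]; n >= args[0]; --n)
      if (IsPrime(n)) return n;
    throw std::runtime_error("no prime in range");
  }
  if (op == "process" && args.size() == 1) return Process(args[0]);
  throw std::runtime_error("cannot apply: " + op);
}

// Evaluates one expression starting at tokens[pos] and advances pos past it.
// Grammar: expr := integer | "(" op expr... ")"
long Eval(const std::vector<std::string>& tokens, std::size_t& pos) {
  assert(pos <= tokens.size());
  if (pos >= tokens.size()) throw std::runtime_error("unexpected end of input");
  const std::string tok = tokens[pos++];
  if (tok != "(") return std::stol(tok);  // an atom; throws if not a number
  if (pos >= tokens.size()) throw std::runtime_error("missing operator");
  const std::string op = tokens[pos++];
  std::vector<long> args;
  while (pos < tokens.size() && tokens[pos] != ")")
    args.push_back(Eval(tokens, pos));
  if (pos >= tokens.size()) throw std::runtime_error("missing )");
  ++pos;  // consume ")"
  return Apply(op, args);
}

// Read a line, evaluate it, print the result, repeat until input ends.
void Repl(std::istream& in, std::ostream& out) {
  std::string line;
  while (out << "> " && std::getline(in, line)) {
    try {
      const std::vector<std::string> tokens = Tokenize(line);
      if (tokens.empty()) continue;
      std::size_t pos = 0;
      out << Eval(tokens, pos) << "\n";
    } catch (const std::exception& e) {
      out << "error: " << e.what() << "\n";
    }
  }
}
```

With `Repl(std::cin, std::cout)` running, typing `(process (avg 210 110))` or `(process (max-prime 10 50))` tries a new offset without reloading anything. Of course, I had to write the whole evaluator by hand just to get three operations, which is exactly the power lisp gives you for free.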

This occurred to me today because switching between "source code world" and "executable world" took nearly a minute. But, even if it only takes a few seconds, it's something which you're doing all the time as you write code, and the lost time quickly adds up. These gaps, when you just sit and wait, often make me drop the mental eggs which I'm juggling. By the time I've switched worlds, I'm left wondering what it was that I needed to do next.

I have the view that programming systems should be aimed at humans, not computers. Performance isn't a big worry since only very small sections of code are bottlenecks and need to be made efficient, and Moore's law is our constant friend. The separation of "source code world" and "executable world" allows you to make your program run a little bit faster, but the cost to the human programmer is enormous.

I'll end on a happy pragmatic note. Even if you are stuck inside a C program you can gain the power of lisp (well, scheme) by embedding guile, the GNU scheme implementation. Just by writing a small amount of glue code, you can have the full power of a programming language available at runtime inside your application. If you have a minute to spare, have a look through the well-written tutorial.