Static typing vs. dynamic typing

The small corner of the blog world which I live in (need a name for that concept) has recently started debating dynamic typing vs static typing (and I’ve already refuted some claims). I’m of the view that static typing is a big benefit, especially in large systems where no one person understands everything. As ever, there are pros and cons to both approaches, so here’s another braindump of ideas and opinions ...

Charles Miller makes the point that a test may fail a long way from where the actual error is. In my experience, this isn’t a big problem if you are running tests regularly. If the test passed an hour ago but fails now, then the bug must be in code you wrote in the last hour.

At work, I have set up an auto-build system which sits watching CVS for new commits. When something new appears, it does a complete system build and test and then emails the committer with a concise summary of any problems. This way, you get much of the benefit of rapid feedback while the context is still fresh in your mind, but you don’t need individual developers running fairly large regression test suites all the time. Plus, we can run tools like Purify on the autobuilder machine and get leak-checking and some error-checking without the hassle of requiring all developers to use them daily.

If the email comes back from the autobuild saying that a test has failed, you can quickly use cvsweb to see what code changes were made in that commit. We also have an automated commit script which does stuff like newsgroup postings, and each newsgroup posting contains a list of links to colorized diffs for the commit. I’m trying to get things to the point where the answer to any common question is a few clicks away. When did the build get so slow? Look in the build archives. Did Bob remember to check for null pointers when he changed MyClass yesterday? Click on the newsgroup posting and look at the diffs.

Oh, I’m getting distracted. Brian Marick isn’t convinced by the “types as documentation” viewpoint. In statically typed languages, you can name variables according to their use rather than their type, whereas in dynamically typed languages you see a lot more variables called ‘aString’. Then again, considering the whole kaboodle, having variables statically typed is only useful if you can easily get to that information when reading code. So you need your development environment to provide tooltips which tell you type information when you hover over a variable name. This is particularly true in ocaml and other languages which do type inference. You don’t need to specify the types in the source, because the compiler can work them out for itself. But to get the “types as documentation” benefit, you then need a code browser which can do the type inference on-the-fly and show it in tooltips.

But that’s not where “types as documentation” really shines. It’s when you start breaking up large systems using interfaces that you get a lot of benefit from static typing. Interfaces (abstract base classes in C++) act as documentation about the high-level structure of a large program. They make explicit where the boundaries of the black boxes in the program are. There’s no real equivalent in a runtime-typed language, although this article talks about a runtime equivalent to interfaces in Smalltalk.
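To make the interface point concrete, here is a minimal C++ sketch. The names (Storage, InMemoryStorage, lookup_or) are made up for illustration and don't come from any real system; the point is only that the abstract base class is the one thing client code gets to see.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical interface: the documented boundary of one "black box".
// Client code may rely on these two operations and nothing else.
class Storage {
public:
    virtual ~Storage() = default;
    virtual void put(const std::string& key, const std::string& value) = 0;
    // Returns true and fills value_out if the key is present.
    virtual bool get(const std::string& key, std::string& value_out) const = 0;
};

// One implementation hidden behind the boundary.
class InMemoryStorage : public Storage {
public:
    void put(const std::string& key, const std::string& value) override {
        assert(!key.empty());  // this sketch assumes non-empty keys
        data_[key] = value;
    }
    bool get(const std::string& key, std::string& value_out) const override {
        auto it = data_.find(key);
        if (it == data_.end()) return false;
        value_out = it->second;
        return true;
    }
private:
    std::map<std::string, std::string> data_;
};

// Client code is written against the interface only, so the compiler
// rejects any attempt to reach into a particular implementation.
std::string lookup_or(const Storage& storage, const std::string& key,
                      const std::string& fallback) {
    std::string value;
    return storage.get(key, value) ? value : fallback;
}
```

Someone reading lookup_or can tell from its signature alone which part of the system it depends on, without knowing which implementation gets plugged in.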

I should try and find a list of claimed “advantages of runtime typing”. I can see the advantages of good reflection systems, such as Smalltalk’s, but I don’t think that’s all that people are talking about. The “it’s boring having to write types in source, and I rarely make type errors anyway” argument doesn’t sway me much – use a language with type inference and you don’t have to write types but the compiler will still typecheck! If you want to be able to pass an assortment of Apples, Oranges and Bananas to a function which expects a Fruit but don’t want to get tangled up in inheritance graphs, ocaml separates subtyping from inheritance. If the function just needs to call “Grow()” and “Cook()” methods on the Fruit, then in ocaml any class which implements Grow() and Cook() would do.
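C++ can get part of the way there with templates. This is only a rough analogue, not ocaml's object system: a function template accepts any class that has Grow() and Cook(), the check happens at compile time, and no shared base class is needed. Apple, Banana and prepare are made-up names, and I'm assuming both methods return a string.

```cpp
#include <string>

// Two unrelated classes: no common base class, no Fruit interface.
struct Apple {
    std::string Grow() const { return "apple grows"; }
    std::string Cook() const { return "apple pie"; }
};

struct Banana {
    std::string Grow() const { return "banana grows"; }
    std::string Cook() const { return "banana bread"; }
};

// Works for any type that provides Grow() and Cook() returning std::string.
// Passing a type without those methods is a compile-time error,
// not a runtime "message not understood".
template <typename Fruit>
std::string prepare(const Fruit& fruit) {
    return fruit.Grow() + ", then " + fruit.Cook();
}
```

Calling prepare(Apple{}) and prepare(Banana{}) both compile, while prepare(42) does not. The difference from ocaml is that the requirement lives only in the template body rather than in a type you can read off the function's signature.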

One big advantage of Smalltalk and Lisp over C++ is the way in which you can patch code without restarting the program. This is possible in C++ too (edit-and-continue in DevStudio) but it’s a lot harder and not as flexible. But this has got me wondering how much of the ‘dynamic’ world you can feed into static languages like ocaml and C++. There’s no reason why objects in C++ can’t carry around their type too. Bjarne would throw his hands up and say “the main reason you have static typing is so you don’t have to do this and it runs faster”. But having better runtime types in C++ would allow decisions to be made long after compile-time. It would make heap-walking a lot easier, for one. It would also make embedding a lisp runtime easier, but that’s defeating the point! On a related note, if a dictionary of function names and addresses were available at runtime, you could write much better crash handlers. Some linker magic required, but it’s do-able.
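Here is a hand-rolled sketch of what "objects carrying around their type" could look like. Standard C++ already has typeid for classes with virtual functions, so treat this as an illustration of the idea rather than a recommendation. Every name in it (TypeInfo, Object, Point, Label, describe) is hypothetical, and the "heap" is just a vector of pointers standing in for a real heap walk.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical per-class record, shared by all instances of that class.
struct TypeInfo {
    const char* name;
    std::size_t size;
};

// Every object stores a pointer to its class record, so code holding
// only an Object pointer can still ask what the object is.
struct Object {
    const TypeInfo* type;
    explicit Object(const TypeInfo* t) : type(t) { assert(t != nullptr); }
};

struct Point : Object {
    static const TypeInfo info;
    int x = 0;
    int y = 0;
    Point() : Object(&info) {}
};
const TypeInfo Point::info = {"Point", sizeof(Point)};

struct Label : Object {
    static const TypeInfo info;
    std::string text;
    Label() : Object(&info) {}
};
const TypeInfo Label::info = {"Label", sizeof(Label)};

// A toy "heap walk": list the runtime type name of each live object.
std::string describe(const std::vector<const Object*>& heap) {
    std::string out;
    for (const Object* obj : heap) {
        out += obj->type->name;
        out += ' ';
    }
    return out;
}
```

The cost is the one Bjarne would complain about: an extra pointer in every object. In return, a debugger, leak checker or crash handler can name what it finds.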