Andrew Birkett's blog – Page 11 – Thoughts of a software engineer

Technical debt (or, mortgages in Haskell)

Post author By Andrew
Post date 11/17/2009

I recently got fed up trying to understand my mortgage using Excel. After twenty minutes guddling with cells and individual values, I felt the need to create higher-level abstractions such as “mortgage” and “payment strategy”. I also wanted to create a list of possible repayment strategies and easily compare them to see how they affect the loan duration and total interest paid. This is possible in Excel, but no fun.

So, fast-forward to the end of an evening’s hacking with Haskell. I now have hmortgage, an EDSL for expressing payment strategies, plus code which will expand a mortgage out into monthly steps, like this:

We are looking at loan of £1000.00 at 5.0% over 10y, which has required monthly payment of £10.58
Baseline:
        Total interest: £272.97 Total payments: £1272.97 Duration=10y 1m
Overpayment scenario "2 pm, 200 initial":
        Total interest: £132.09 Total payments: £1132.09 Duration=6y 3m
        Compared to baseline: interest=£-140.88, payments=£-140.88, duration=-3y 10m
For month 1, balance: £1000.00 -> £791.58       (interest: £4.16, payment: £212.58)
For month 2, balance: £791.58 -> £782.29        (interest: £3.29, payment: £12.58)
For month 3, balance: £782.29 -> £772.96        (interest: £3.25, payment: £12.58)
For month 4, balance: £772.96 -> £763.60        (interest: £3.22, payment: £12.58)

ie. if you overpay by £2 each month, and pay an initial lump sum of £200, you’ll save about £140 overall and will repay the mortgage nearly 4 years early.
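The month-by-month expansion can be sketched in a few lines. This is not the hmortgage source: every name below is made up for illustration, money is held as whole pence, and I am assuming that each month's interest is truncated to a whole penny, which happens to reproduce the four sample rows above.

```haskell
-- Hypothetical sketch; none of these names come from hmortgage.
-- Money is a whole number of pence, so there is no floating-point drift.
type Pence = Integer

-- One month's interest at an annual percentage rate, truncated to whole
-- pence (an assumption chosen because it matches the sample rows).
monthlyInterest :: Integer -> Pence -> Pence
monthlyInterest annualPercent balance = (balance * annualPercent) `div` 1200

-- Apply one month: add interest, then subtract the payment, never paying
-- more than is owed.
stepMonth :: Integer -> Pence -> Pence -> Pence
stepMonth annualPercent balance payment =
  let owed = balance + monthlyInterest annualPercent balance
  in owed - min payment owed

-- End-of-month balances for a payment plan indexed by month number
-- (starting at 1). The list ends once the balance reaches zero.
balances :: Integer -> Pence -> (Integer -> Pence) -> [Pence]
balances annualPercent start plan = go 1 start
  where
    go month balance
      | balance <= 0 = []
      | otherwise =
          let next = stepMonth annualPercent balance (plan month)
          in next : go (month + 1) next
```

In GHCi, with a plan that pays 21258 pence in month 1 and 1258 pence afterwards, take 4 (balances 5 100000 plan) gives the balances 791.58, 782.29, 772.96 and 763.60 from the listing, expressed in pence.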

There are a few points of Haskelly interest in this code, mostly inspired by stuff I read a few years ago – behaviors in FRP, and SPJ’s “composing contracts” paper.

Combinators for payment strategies

I have a few primitive payment strategies, which can be combined into more complex strategies:

monthlyPaymentsOf (100 Pounds)
lumpSumOf (100 Pounds)
lumpSumOf (100 Pounds) `after` (1 Year)
monthlyPaymentsOf (100 Pounds) +. (lumpSumOf (100 Pounds) `after` (1 Year))

Shallow embedding of DSL

The DSL is a shallow embedding; it represents the monthly payment plan as a function from month-number to the payment amount, ie. Integer -> Currency. There’s a problem with this approach – the only thing you can do with a function is apply it to some arguments. This is fine for finding the payment for a particular month, but I would also like to derive a textual description of the payment plan – which isn’t possible with functions.
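To make the shallow embedding concrete, here is a minimal sketch of how such combinators could be written. It is a guess at the shape rather than the real hmortgage code: Currency is simplified to a plain Integer of pence, delays are given as a number of months instead of (1 Year), and I am assuming months are numbered from 1 with a lump sum landing in month 1.

```haskell
-- Hypothetical sketch of a shallow embedding; the names mirror the post
-- but the definitions are assumptions.
type Currency = Integer            -- pence, a stand-in for the real type
type Plan = Integer -> Currency    -- month number -> payment that month

-- The same amount every month.
monthlyPaymentsOf :: Currency -> Plan
monthlyPaymentsOf amount = \_ -> amount

-- A single payment in month 1, nothing afterwards.
lumpSumOf :: Currency -> Plan
lumpSumOf amount = \month -> if month == 1 then amount else 0

-- Shift a plan later by a number of months; it pays nothing until then.
after :: Plan -> Integer -> Plan
after plan delay = \month -> if month > delay then plan (month - delay) else 0

-- Combine two plans by adding their payments month by month.
infixl 6 +.
(+.) :: Plan -> Plan -> Plan
p +. q = \month -> p month + q month
```

With these definitions, monthlyPaymentsOf 10000 +. (lumpSumOf 10000 `after` 12) is just another function, which is exactly why there is nothing to inspect when you want a description of it.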

From stuff I’ve read previously, I think my two options are:

Lisp-like: Represent the payment schedule as data (ie. like an AST) and provide an eval function. This allows introspection into structure of the payment schedule. Code is data, data is code.
Arrow-like: The payment strategy could be a tuple of the function and a textual description. When strategies are combined, the combinator would merge the textual descriptions as well as producing new combined functions. I’m not totally convinced that the English language is ‘compositional’ in this way though – it might end up with really clumsy phrasing.
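The first of those options might look something like the sketch below. The constructor names, the pence-as-Integer representation and the wording produced by describe are all assumptions for illustration; the point is only that one data structure can feed both an evaluator and a pretty-printer.

```haskell
-- Hypothetical deep embedding: the plan is plain data.
data Plan
  = Monthly Integer      -- pay this many pence every month
  | LumpSum Integer      -- pay this many pence in month 1 only
  | After Integer Plan   -- delay a plan by a number of months
  | Both Plan Plan       -- pay both plans at once

-- Interpret a plan as "payment due in a given month" (months start at 1).
eval :: Plan -> Integer -> Integer
eval (Monthly amount) _ = amount
eval (LumpSum amount) month = if month == 1 then amount else 0
eval (After delay plan) month =
  if month > delay then eval plan (month - delay) else 0
eval (Both p q) month = eval p month + eval q month

-- Walk the same structure to produce a textual description.
describe :: Plan -> String
describe (Monthly amount) = show amount ++ "p every month"
describe (LumpSum amount) = "a lump sum of " ++ show amount ++ "p"
describe (After delay plan) =
  describe plan ++ " after " ++ show delay ++ " months"
describe (Both p q) = describe p ++ ", plus " ++ describe q
```

The cost is that every new combinator needs a constructor and a case in each interpreter, whereas in the shallow version a new combinator is just another function.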

Crazy Lennart-inspired postfix operators

Initially, the only way I had to create a ‘Currency’ value was via the ‘pounds’ function. In Haskell, the function precedes the argument, hence it looks like “pounds 20”. The source code would read nicer if I could write this as “20 pounds” like we do in English. I didn’t think this was possible in Haskell.

Then I remembered seeing Lennart Augustsson’s crazy embedding of BASIC into Haskell. In particular, he had code which looked like this:

runBasic $ do
  10 PRINT "HELLO"
  20 END

How the heck does that parse? It’s using ‘do’ notation, so “20 END” must have a type in the Monad class. But, as I understood things, “x y” means “apply the (function) value x to value y”. And “20” doesn’t look much like a function to me.

Digging into the source, I found this:

-- 10 END
instance Num (END -> Expr a) where
    fromInteger i c = ...

Hmm, interesting. This is saying that (some) function type can be treated as if it is “number like” and provides a mechanism for converting integer literals in source code to that type. I hadn’t fully appreciated this, but the Haskell Report says that numeric literals aren’t quite as literal as I expected – the literal integer value gets passed through ‘fromInteger’ and can therefore be made into any type which has a Num instance.
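A tiny example of that literal rule, using an ordinary (non-function) type first. The Pence newtype here is invented for the illustration and is not part of hmortgage.

```haskell
-- Hypothetical wrapper type, used only to show literal overloading.
newtype Pence = Pence Integer deriving (Eq, Show)

instance Num Pence where
  fromInteger n = Pence n            -- every integer literal goes through here
  Pence a + Pence b = Pence (a + b)
  Pence a - Pence b = Pence (a - b)
  Pence a * Pence b = Pence (a * b)
  abs (Pence a) = Pence (abs a)
  signum (Pence a) = Pence (signum a)

-- The literals 40 and 2 are read as fromInteger 40 and fromInteger 2,
-- so both become Pence values before (+) is applied.
total :: Pence
total = 40 + 2
```

Nothing in the type signature of total mentions Integer, yet the literals typecheck, because the compiler picks the Num Pence instance from the expected type.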

So this code really says “Hey ghc, if you come across a ‘42’ in the source code, you can turn that into a function if you need to”. In the BASIC example, the next thing on line 20 is “END”, a constructor for the type also called END. So, ghc will be looking to turn “20” into something that can be used as a function taking an argument of type END, and so it’ll call this instance of fromInteger.

Hurrah, I can use the same ‘trick’ to make my currencies look nicer:

data MONEY = Pounds | Pence

instance Num (MONEY -> Currency) where
  fromInteger i Pounds = C (i * 100)
  fromInteger i Pence = C i

Now I can say “42 Pounds” or “23 Pence”. The “42” will become a function with type MONEY -> Currency. The “MONEY” type is really just a tag – it’s what makes ghc pick this instance of fromInteger, but that’s it. The Pounds/Pence constructor then selects the matching equation of fromInteger, and this will construct a Currency value (represented as number of pence, and using a simple wrapper constructor called C).
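For anyone wanting to try this, here is a self-contained version that should load in ghc as-is. The Currency definition, the FlexibleInstances pragma and the stubbed-out arithmetic methods are my additions for the sketch; the snippet above only shows fromInteger, and arithmetic on "functions that are numbers" isn't meaningful here, so those methods just raise an error.

```haskell
{-# LANGUAGE FlexibleInstances #-}

-- Currency is a count of pence behind the wrapper constructor C.
-- The Integer payload and the derived instances are assumptions.
newtype Currency = C Integer deriving (Eq, Show)

data MONEY = Pounds | Pence

-- Hypothetical helper for the Num methods this sketch does not support.
unsupported :: String -> a
unsupported name = error (name ++ " is not defined for MONEY -> Currency")

-- A function type as a Num instance: an integer literal becomes a function
-- waiting for its unit tag.
instance Num (MONEY -> Currency) where
  fromInteger i Pounds = C (i * 100)
  fromInteger i Pence  = C i
  (+)    = unsupported "(+)"
  (*)    = unsupported "(*)"
  negate = unsupported "negate"
  abs    = unsupported "abs"
  signum = unsupported "signum"

-- The type signatures tell ghc the result is a Currency, which is what
-- lets it choose the instance above for the literals.
deposit :: Currency
deposit = 42 Pounds

change :: Currency
change = 23 Pence
```

One wrinkle: the result type has to be known from context. Something like show (42 Pounds) with no annotation leaves ghc unable to pick an instance, so in practice the literals need to appear where a Currency is expected.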

Is this better, or just “clever”? I’m not sure yet. It’s certainly easier to read. But I feel I’ve taken a step away from “pure haskell” into a slightly weird world. Still, if I were writing in lisp, I wouldn’t think twice about doing this kind of thing.

The actual app

Shocker, I’ve produced an app which is actually useful to me in “teh real world”. I have a big TODO list of stuff which will fit nicely into the app – time-varying interest rates, inflation predictions and NPV calculations. None of which, of course, I will ever actually get around to adding. But it’s still useful in its present state, so a win!

Here’s what the “summary” view says – it omits the monthly breakdown and instead reports the overall savings possible via the different payment strategies:

We are looking at loan of £1000.00 at 5.0% over 10y, which has required monthly payment of £10.58
Baseline:
        Total interest: £272.97 Total payments: £1272.97 Duration=10y 1m
Overpayment scenario "2 pm, 200 initial":
        Total interest: £132.09 Total payments: £1132.09 Duration=6y 3m
        Compared to baseline: interest=£-140.88, payments=£-140.88, duration=-3y 10m
Overpayment scenario "2 pm only":
        Total interest: £216.50 Total payments: £1216.50 Duration=8y 1m
        Compared to baseline: interest=£-56.47, payments=£-56.47, duration=-2y
Overpayment scenario "200 initial":
        Total interest: £163.52 Total payments: £1163.52 Duration=7y 8m
        Compared to baseline: interest=£-109.45, payments=£-109.45, duration=-2y 5m
Overpayment scenario "400 initial":
        Total interest: £87.73 Total payments: £1087.73 Duration=5y 6m
        Compared to baseline: interest=£-185.24, payments=£-185.24, duration=-4y 7m
Overpayment scenario "200 after 2y":
        Total interest: £191.42 Total payments: £1191.42 Duration=7y 10m
        Compared to baseline: interest=£-81.55, payments=£-81.55, duration=-2y 3m
Overpayment scenario "400 after 2y":
        Total interest: £137.90 Total payments: £1137.90 Duration=5y 10m
        Compared to baseline: interest=£-135.07, payments=£-135.07, duration=-4y 3m

Eep, it’s 01:30 .. how did that happen? Stoopid jetlag …

Tags haskell

General

DRM fail

Post author By Andrew
Post date 11/11/2009

I recently got the NASA When We Left Earth DVDs, and I thought “great, I’ll be able to watch them during my Seattle trip”. So, I put them into the hotel DVD player tonight .. and got a “cannot play” error. Arg, the DVDs will be region 2 (Europe) and the player is region 1 (US). So, despite this being a completely legal, paid for copy of the DVDs which I brought with me, I cannot watch them! Grr!

General

Sleepy in Seattle

Post author By Andrew
Post date 11/9/2009

What does one do to stay awake in Seattle after getting up at 3am, 14 hours of travelling and an 8 hour timezone shift? In my case, grab a coffee (how native) and head to the awesome Seattle Public Library. Without much of a plan, apart from staying awake. As it turns out, I randomly stumbled upon an archive of Communications of the ACM dating back to the very first edition, in 1958.

1958 was a strange old world. Things considered newsworthy: buying a new computer (as in, ‘Foo University has purchased an IBM 456 with 2048 bytes of memory’), upgrading the memory in your existing computer (particularly when you are building said memory from scratch yourself). Other articles included puzzles similar to chess end-games – ie. implement [trivial operation] using only 6 bytes of IBM xyz machine code but without using any jump operations.

I skipped forward to November 1976, the month I was born. An article by Jim Gray on db locking in which it’s necessary to define the term ‘transaction’ explicitly. The previous month, there’s an early paper about texture mapping by Jim Blinn with lots of pretty pictures. Again, it’s enlightening to see ‘basic’ stuff being laboriously explained .. for example, why you get aliasing effects if you sample the texture naively. But it wasn’t “basic stuff” back then; it was the frontier of knowledge.

On the flight across, I was reading a biography of Oliver Heaviside. The book covers both his physics work and also the world, time and society that he lived in. In particular, it’s fascinating to read about how resistant (sic) “practical” electrical engineers were to the new-fangled mathematics-wielding theoreticians who had started to dominate the field. There were many vocal engineers who were quite sure that they didn’t need “all that maths stuff”.

For every success story celebrated and enshrined in today’s textbooks, there were many other forgotten voices arguing against that viewpoint in the publications of the day. I’m sad that almost every textbook I read at university missed out all of this rich tapestry – instead they provided a neatly cut-and-dried distillation, devoid of any human context. To me, real science was, and presumably still is, a process of muddling around in a sea of uncertainty and conflicting schools of opinion. I seemed to learn about the abstract scientific method (very useful!) but not so much about the day-to-day struggles of real scientists. Much later, I found my way to Thomas Kuhn and biographies of Faraday, Maxwell, Boltzmann, etc. And there I found a much more interesting picture, crucially explaining the ideas in their original context.

So it’s nice to be able to go back to the original sources and imagine what it might’ve been like to be a ‘computer person’ in 1958 .. to see what kind of ideas were thinkable in that time, to see who was prodding at the boundaries, and to see how much is recognizable today.

General

Coders at Work

The coders who are interviewed in “Coders at Work” all have interesting opinions, but it’s the recurring themes which have really grabbed me.

The first theme is that all of these people are humans. They might all be famous for doing X, but hardly any of them set out consciously to do X, and none of them did what they did because they knew it’d lead them to where they are today. When you read about how they got into their area, the recurring themes are serendipity and “I did it because it was fun”. As Simon Peyton Jones says explicitly, the important thing is just getting started – because once you get started there’s a million interesting things you could play with. I don’t want to downplay the cool stuff some of these guys have done, but it’s hugely enlightening to hear them talk in their own words and hear how “normal” they all are.

The other theme I noticed is that everyone lives in their own little niche. Very few people in the book seem to have a broad overview of computing and how it’s changed over the decades. In particular, you can see how people’s thinking is constrained by either the era in which they learned about computers, or by the particular area that they’ve specialized in. It’s refreshing to hear Simon Peyton Jones say that he doesn’t really have a deep understanding of OO programming, because he’s basically not done that much of it – he doesn’t knock it, either. It’s weird to hear Peter Deutsch describe his dream language without him knowing that these ideas have already been tried in Haskell. It’s interesting to hear people who are famous as being ‘lisp guys’ or ‘smalltalk guys’ knocking ‘their own’ language. And it’s amazing to see the split between low-level and high-level thinkers. I’m biased, because I’m into programming languages, but few people commented on the extent to which your Preferred Language affects your modes of thinking – although the results are plain to see.

Finally, this book made me realise that I’ve been in this game for quite a long time now (I’m only 32!). Enough time for entire chapters of knowledge to have come and gone. Programming in assembler, gone! Well not totally; still useful for compiler backends, security exploits and such like. Manual memory allocation, gone! Well not totally, there’s still kernel development and embedded stuff. Segmented memory models, gone but back for a wee while with PAE. Implementing primitive datastructures, largely done for you! C++, gone (for me at least)! I spent so much time getting really good at it too, hmmph. I respect it for what it is, but there are much nicer ways to spend your life.

But that’s all fine. Technology reflects the era that it was born in. C made sense when memory was expensive and CPUs slow. Now virtual memory and VMs make sense. When resources were scarce, conceptual clarity was sacrificed to gain performance. Now we usually don’t need to make that sacrifice. The abstractions which made sense for a 1990’s desktop GUI app aren’t the ones you need for a 2009 network-based distributed system.

Is history important? Only partly, I think. The high level lessons are certainly important, but the details aren’t. Do you need to be able to code up a red/black tree today? No. But I think a developer should have a deep appreciation for the distinction between interface and implementation – and you should understand how the implementation choices can affect you (as a user). Do you need to understand low level hardware/assembler? No, but the concepts and solutions which crop up at that level crop up elsewhere too, so it’s certainly not wasted knowledge. Do you need to learn smalltalk? Only really to learn the ‘lessons of smalltalk’ – to see what you can do with a reflective, late-bound, heavily interactive system.

All in all, I’m whimsical about the amount of technology water that’s passed under my particular bridge. Easy come, easy go, I am not the sum of my knowledge. I’m happy to keep absorbing new fun stuff as times change and mostly I’m quite happy to see the back of the declining technologies anyway! It’s comforting in a way to see that there’s no real wizards out there, there’s just people hacking away on stuff that they think is cool and being ready to recognize the insights and the epiphanies when they come by. Evolution (and some marketing $$$s) usually takes care of picking out those solutions which are suited to the present environment. And there’s always the interesting “superior but ignored” technologies hovering around in the wings.

Computing has only really been around for one lifetime. Most of the first-generation guys are still alive. It’s interesting to hear some of them reflecting on a life spent involved in this area. I guess I’m taking a moment to reflect on where I am.

“Coders at Work”: loving it

General

Unicode Scion for Haskell

Post author By Andrew
Post date 9/6/2009

Scion provides a way for emacs to chat to ghc, giving you nice squiggly red lines under your errors in realtime.

I thought I’d found a string encoding problem in scion, but after taking a long route (tcpdump etc) it turns out that it’s just a poor default. Out of the box, scion will assume that the scion server (and in turn, ghc) speaks latin1. However, you can just add this to your .emacs to make it speak utf8 instead:

(setq scion-net-coding-system 'utf-8-unix)

Now your lambdas can be λ’s and you still get pretty error messages in scion!