Haskell is possibly too lazy for me

This is the first of several posts on the topic of Haskell’s laziness. After several weeks of playing, I’m coming to the conclusion that laziness-by-default is a hinderance rather than a virtue. Let’s start at the start though by trying to add some numbers together.

-- Non tail recursive; 5Mb of live objects at end.
mysum []     = 0
mysum (x:xs) = x + mysum xs
main = putStrLn $ show $ mysum $ take 100000 $ [1..]

As the comment says, this is a dumb version. It consumes 5Mb of memory because it’s not tail recursive.

Incidentally, after causing my machine to thrash several time during my experiments, I found it useful to use ‘ulimit’ to restrict the maximum heap size available to the process. Also, you can pass extra args to your haskell app to get it to report real-time memory stats, like this:

ghc sum.hs && /bin/bash -c 'ulimit -Sv 100000; ./a.out +RTS  -Sstderr'

Anyhow, the memory blowup is easy to fix; just pass an ‘accumulator’ parameter when you do the recursive call:

-- Tail recursive, but 3.5Mb of live objects at end.
mysuma acc []     = acc
mysuma acc (x:xs) = mysuma (acc+x) xs
main = putStrLn $ show $ mysuma 0 $ take 100000 $ [1..]

Hmm, it’s now tail recursive but it still consumes 3.5Mb? This is where Haskell’s laziness makes things quite different from ocaml and other strict languages. When we pass the accumulated value, haskell does not actually evaluate the addition prior to making the recursive call. It will delay the computation until its value is actually required. So, on each recursive call, the accumulator looks like an unevaluated “1+2” and then “1+2+3” etc.

We can fix this by explicitly telling haskell to evaluate the addition prior to making the call:

-- Tail recursive, with 'seq' to force immediate evaluation of addition. 
-- 40k of live objects at end.
mysumas acc []     = acc
mysumas acc (x:xs) = (acc+x) `seq` mysumas (acc+x) xs
main = putStrLn $ show $ mysumas 0 $ take 100000 $ [1..]

Finally we have a program which only consumes a tiny amount of heap space. But it took a surprising amount of effort. There’s lots more information about this situation on the haskell wiki.

RAID-1 sans disks

In preparation for a somewhat more dramatic future experiment, I’ve been trying out RAID-1 failure modes using linux’s loopback capabilities to avoid having to actually buy any more real hard drives. You can simulate drive failures, failover and easily see what the current disk contents are. It should go without saying that, if you have a real RAID system currently running, you probably don’t want to execute these commands without thinking a bit first:

# Creating and destroying disks from the safety of your own console
mkdir ~/raid; cd ~/raid

# Create two 10Mb files called disk0 and disk1
for d in 0 1; do dd if=/dev/zero of=disk${d} bs=1024 count=10240; done

# Make them available as block devices using the loopback device
for d in 0 1; do sudo losetup /dev/loop$d disk$d; done

# Combine the two 'disks' into a RAID-1 mirrored block device
# Using '--build' rather than '--create' means there is no device
# specific metadata, and so the contents of the disks will be identical
sudo mdadm --build --verbose /dev/md0 --level=1 --raid-devices=2 /dev/loop[01]

# Create a filesystem on our raid device and mount it
sudo mkfs.ext3 /dev/md0 
mkdir /tmp/raidmnt
sudo mount /dev/md0 /tmp/raidmnt
sudo chown $USER /tmp/raidmnt

# The contents of both disks change in unison
md5sum disk[01]
date > /tmp/raidmnt/datefile
sync
md5sum disk[01]

# If we mark one disk as failed, disk contents diverge
sudo mdadm --fail /dev/md0 /dev/loop0
date > /tmp/raidmnt/datefile
sync
md5sum disk[01]

# Remove the failed disk and readd it, and RAID1 will sync
sudo mdadm --remove /dev/md0 /dev/loop0
sudo mdadm --add /dev/md0 /dev/loop0
sleep 1
md5sum disk[01]

# Add a third (unused) disk into the system to test failover
dd if=/dev/zero of=disk2 bs=1024 count=10240
sudo losetup /dev/loop2 disk2
sudo mdadm --add /dev/md0 /dev/loop2
sudo mdadm --detail /dev/md0

# When one of original two disks fail, the new disk gets used
md5sum disk[012]
sudo mdadm --fail /dev/md0 /dev/loop0
date > /tmp/raidmnt/datefile
sync
md5sum disk[012]

# Tidy up the world
sudo umount /dev/md0
sudo mdadm -S /dev/md0
for x in /dev/loop[012]; do sudo losetup -d $x; done
rm -rf /tmp/raidmnt ~/raid