Strict-by-default vs Lazy-by-default

In a lan­guage with lazy eval­u­ation, a value is com­puted only when it is needed, and then the com­puted value is stored. This is in con­trast with strict eval­u­ation (which is by far the most com­mon), where values are com­puted as soon as they are defined.

Haskell uses lazy eval­u­ation, and I think it’s one of the great strengths of the lan­guage. But the pre­valent opinion in the pro­gram­ming world is that it’s maybe a nice idea, but bad in prac­tice. After all, you can al­ways ex­pli­citly in­tro­duce lazi­ness in the form of func­tion calls when you need it. Well, I think it’s a better de­fault. So let’s start out with some pros of lazi­ness:

  • The lan­guage be­comes more uni­form.

I dis­like spe­cial cases in lan­guages, I like things to be nice and simple and uni­form. I ex­pect most pro­gram­mers would agree with me. And yet there are many in­stances of non-uni­formity which everyone seems happy with!

Con­sider the simple boolean and op­er­ator with short-­cir­cuit­ing, &&. Could you define this in a strict lan­guage? No. Boolean op­er­ators tend to have spe­cial eval­u­ation rules as­so­ci­ated with them (c­alled a spe­cial form in the Lisp world), be­cause the right ar­gu­ment is typ­ic­ally not eval­u­ated if the result can be de­term­ined purely from the left ar­gu­ment.

In Haskell, here is how you can define the short-­cir­cuiting boolean and:

(&&) :: Bool -> Bool -> Bool
True  && b = b
False && _ = False

An­other common spe­cial form is the if/then/else state­ment. This cannot be defined in a strict lan­guage for much the same reason: only one branch is eval­u­ated de­pending on the value of the con­di­tional. It’s just a normal func­tion given lazi­ness:

ifThenElse :: Bool -> a -> a -> a
ifThenElse True  t _ = t
ifThenElse False _ f = f

Lazi­ness lets you define new con­trol struc­tures which the de­signers of the lan­guage had not fore­seen, this makes the lan­guage much more flex­ible.

  • It does at most as much com­pu­ta­tion as would be done under strict eval­u­ation.

The “worst case” for lazy eval­u­ation is when all values are needed. Everything is eval­u­ated, so there is no saving over strict eval­u­ation. On the other hand, if even one value is not needed, less work has been done than would have been under strict eval­u­ation.

This ob­ser­va­tion comes from an early paper on lazy eval­u­ation, called CONS Should Not Eval­uate Its Ar­gu­ments, done in the con­text of Lisp.

  • Code can be more mod­u­lar.

Here’s an ex­ample of the New­ton-Raphson method of finding square roots, taken from Why Func­tional Pro­gram­ming Mat­ters. Firstly, we define a func­tion to re­peatedly apply a func­tion to a value and gather the res­ults into a list:

re­peat :: (a -> a) -> a -> [a]
re­peat f a = a : re­peat f (f a)

Oops, we’ve already moved out of the realm of what strict­ness can deal with. There’s no base case for the re­cur­sion in re­peat, so a strict lan­guage would just enter an in­finite loop.

Any­way, now we need a func­tion which, given some tol­er­ance and a list of ap­prox­im­a­tions, finds the first value where two suc­cessive ap­prox­im­a­tions don’t differ by more than that tol­er­ance:

within :: Double -> [Double] -> Double
within ep­silon (a:b:rest) = if abs (a - b) <= ep­silon
  then b
  else within ep­silon (b:rest)

And now the square root func­tion is al­most trivial:

sqrt a0 ep­silon n = within ep­silon (re­peat next a0) where
  next x = (x + n/x) / 2

It may not look like we’ve gained much, but ac­tu­ally we have: both re­peat and within can be re-used in other con­texts. In order to make the pro­gram mod­ular like this in a strict lan­guage, we would need to ex­pli­citly in­tro­duce lazi­ness in the form of a gen­er­ator. That’s more work.

The paper then goes on to re-use these func­tions to cal­cu­late the square root a slightly dif­ferent way; to im­ple­ment nu­mer­ical dif­fer­en­ti­ation; and to im­ple­ment nu­mer­ical in­teg­ra­tion. It turns out that within and re­peat are just the things you need to im­ple­ment nu­mer­ical al­gorithms.

  • Memoisa­tion (can be) free!

A common op­tim­isa­tion tech­nique is to store the result of a func­tion in some sort of lookup table, so if the func­tion is given the same ar­gu­ments, the result can be re­turned without needing to re­cal­cu­late it. Well, if you can rep­resent your func­tion as a data struc­ture, lazi­ness does that for you for free! Caching res­ults is what it is good for, after all.

We can use this to im­ple­ment a simple func­tion to get the nth prime in linear time after it is first cal­cu­lated:

primes :: [Int]
primes = ...

prime :: Int -> Int
prime n = primes !! n

The first time prime n is com­puted for some n, the primes list will be eval­u­ated up to that point and the value re­turned. If we ever call primes m, where m <= n, then all the func­tion does is tra­verse the list to find the value which has already been com­puted! This is a classic ex­ample of trading space for time, the lookup table uses memory, but we don’t need to do a po­ten­tially ex­pensive cal­cu­la­tion every time. The nice thing here is that this doesn’t re­quire any spe­cial sup­port for ex­plicit memoisa­tion; it’s just a con­sequence of lazy eval­u­ation.

Nat­ur­ally, there are dis­ad­vant­ages to lazy eval­u­ation as well:

  • If po­ten­tially uses more memory.

Lazi­ness is im­ple­mented by in­tro­du­cing thunks. A thunk is a pointer to some code and an en­vir­on­ment. When the value rep­res­ented by a thunk is de­man­ded, the code is eval­u­ated and the thunk re­placed with its res­ult. This gives us the eval­u­ation only when we need it and the cach­ing. When a value is needed im­me­di­ately, the thunk is just a waste of space.

Lazi­ness is a bit of a gamble; you’re making the judge­ment that the space saved by not needing to com­pute things will offset the space wasted by al­loc­ating these thunks.

  • It’s dif­fi­cult to pre­dict when memory is freed.

It’s just really hard to build up an in­tu­ition for this. Pro­filers are all but es­sen­tial if you get a non­trivial space leak. For­tu­nately, Haskell has pretty good sup­port for heap pro­filing, but it is cer­tainly much easier to debug space leaks in a strict lan­guage.

  • Re­fact­oring can have non-­local ef­fects.

A friend was re­fact­oring some non­trivial code he had part-in­her­ited part-writ­ten, and in doing so he changed the strict­ness of a func­tion slightly. This im­me­di­ately caused the pro­gram to crash with a pat­tern match failure in a com­pletely sep­arate mod­ule.

What!?

Cue a long and frus­trating de­bug­ging ses­sion, trying to figure out why this was hap­pening and how to fix it.

It turns out that the change in strict­ness was rip­pling back­wards through the rest of the pro­gram, and forced some­thing to be com­puted which wasn’t being be­fore. Some func­tion that had been written years ago by someone who had since left had a missing case in a pat­tern match. This had been fine, but now the result of ap­plying this func­tion to a value which matched that case was needed: hence the er­ror.

That is awful. In a strict lan­guage, the missing case would have been found when that code was first writ­ten, and would have been cor­rec­ted. In a lazy lan­guage, you get to dis­cover it months (or even years) after it was in­tro­duced and the ori­ginal au­thor is gone.

The Killer Fea­ture of Lazi­ness

You can make a lazy func­tion strict, but you can’t make a strict func­tion lazy without re­writing it.

And this, to me, is the reason why lazi­ness is the better de­fault.

Let’s say we have a lazy func­tion we want to make strict. Haskell provides a func­tion called seq, which eval­u­ates its first ar­gu­ment and re­turns its second. Given this, we can con­struct a func­tion to fully eval­uate a data struc­ture. Let’s keep it simple and op­erate on lists for now:

se­qList :: [a] -> b -> b
se­qList (a:as) b = a `seq` se­qList as b
se­qList [] b = b

De­manding the result of se­qList as b re­curses along the en­tire list (be­cause that is the ter­min­a­tion con­di­tion of the re­cur­sion), using seq to eval­uate each ele­ment1.

Give this, we can make an ar­bit­rary func­tion op­er­ating on lists com­pletely strict:

lazy­Func­tion :: [a] -> [b] -> [c] -> [d]
lazy­Func­tion foo bar baz = ...

strict­Func­tion :: [a] -> [b] -> [c] -> [d]
strict­Func­tion foo bar baz =
  let foo' = se­qList foo foo
      bar' = se­qList bar bar
      baz' = se­qList baz baz
      result = lazy­Func­tion foo' bar' baz'
  in foo' `seq` bar' `seq` baz' `seq` se­qList result result

When the result of strict­Func­tion is de­man­ded, both the ar­gu­ments and it will be fully eval­u­ated be­fore being re­turned. This is be­cause se­qList fully eval­u­ates its first ar­gu­ment be­fore re­turning the second, and we’re giving it the same list for both ar­gu­ments. A neat trick! In fact, this pat­tern is so useful that the deepseq pack­age, which provides util­ities for fully eval­u­ating data struc­tures, provides a func­tion to this ef­fect called force, which eval­u­ates its ar­gu­ment and re­turns it.

Caveat: The eval­u­ation of lazy­Func­tion will still al­locate thunks, which will then be im­me­di­ately forced by se­qList. This res­ults in churn in the heap and extra work for the garbage col­lector, which is bad. A better func­tion can be achieved if the lazy one is re­written with strict­ness in mind, as then these ex­cess thunks can be avoided. This is typ­ic­ally only an issue in per­form­ance-crit­ical code: for ex­ample the im­ple­ment­a­tions of data struc­tures. Most of the time you don’t need to worry about that extra bit of al­loc­a­tion.

We can even go for a middle-­ground, where the func­tion is strict in its result but po­ten­tially lazy (de­pends on how the func­tion is defined) in its ar­gu­ments:

stric­tish­Func­tion :: [a] -> [b] -> [c] -> [d]
stric­tish­Func­tion foo bar baz =
  let result = lazy­Func­tion foo bar baz
  in se­qList result result

This is more use­ful. Wanting to be strict in the ar­gu­ments like that is quite an un­common use-case in prac­tice.

Now let’s think about how to make a strict func­tion lazy without poking around in its im­ple­ment­a­tion. So we want some­thing of this form again:

strict­Func­tion :: [a] -> [b] -> [c] -> [d]
strict­Func­tion foo bar baz = ...

lazy­Func­tion :: [a] -> [b] -> [c] -> [d]
lazy­Func­tion foo bar baz =
  let result = strict­Func­tion foo bar baz
  in ??? result

Wait, that’s not right. Any­thing we pass in to strict­Func­tion will be fully eval­u­ated, be­cause it’s strict! So the func­tion can’t be that shape. We can’t alter the ar­gu­ments to strict­Func­tion to make the lazi­ness ex­plicit either, as that would change its type! We can wrap up the result of strict­Func­tion in a thunk, but as soon as we try to do any­thing with it, any ar­gu­ments passed in will be fully eval­u­ated.

We just can’t make it lazy.

Ad­dendum: Strict and Lazy Data Struc­tures

“If lazi­ness is so great, then why does pretty much every Haskell data struc­ture lib­rary provide both strict and lazy ver­sions of everything?”

A good ques­tion, but not as re­lated as it ap­pears at first glance. Lazy eval­u­ation is about func­tions, lazy data is about the rep­res­ent­a­tion of data struc­tures in memory. Using strict data struc­tures does not ne­ces­sarily make your pro­gram strict.

Sup­pose I have a strict string type:

silly :: [a] -> Strict­String
silly _ = "hello world"

Un­less the list type is also strict, this func­tion will work just fine on in­finite lists, even though the result is a totally strict value which is al­ways fully eval­u­ated.

The reason we have both strict and lazy ver­sions of data struc­tures is be­cause quite often you know that you will be using some­thing in a strict con­text, in which case you can write things in a way which will avoid al­loc­ating the use­less thunks. But not every use is com­pletely strict, in which case the lazy ver­sion is prefer­able.


  1. This is ac­tu­ally a slight lie, seq only eval­u­ates to “weak­-­head normal form”, which cor­res­ponds to the out­er­most con­structor of a data­type. We ac­tu­ally need to ap­pro­pri­ately define a se­qList-like func­tion for every type we want to be able to fully eval­u­ate. The deepseq package provides a type­class to handle this.

Date
Tags
haskell, programming
Target Audience
Haskell programmers.
Attention Conservation Notice
In every discussion of Haskell, the issue of strict vs lazy evaluation comes up. I think laziness is the better default.