How lazy evaluation forced Haskell to be pure

Question

I remember seeing a presentation in which SPJ said that lazy evaluation forced them to keep Haskell pure (or something along that line). I often see many Haskellers saying the same.

So, I would like to understand how lazy evaluation strategy forced them to keep Haskell pure as opposed to a strict evaluation stragegy ?

If you have lazy evaluation, it means any impure actions happen in random order. That kinda stops people trying to sneak impure actions in; actions that happen in random order aren't very useful. — MathematicalOrchid, Jul 17 '15 at 13:38
@MathematicalOrchid I don't get that... Haskell already separates impure actions in the IO monad so they are sequenced independently of the evaluation strategy... — Bakuriu, Jul 17 '15 at 13:46
@Bakuriu Haskell separates impure actions into the IO monad *because* otherwise their ordering would be completely arbitrary, which is *because* of lazy evaluation. If it weren't for lazy evaluation, you could just "cheat" and write pretend functions which are actually impure. But because of laziness, it's hopeless to try, no matter how tempted you might be. — MathematicalOrchid, Jul 17 '15 at 13:50
Anyway one thing about lazy vs strict evaluation is that in lazy evaluation you are able to think more compositionally (i.e. in a more declarative/functional way) and obtain an *efficient* solution. Using a strict evaluation strategy this may not be possible. For example with lazy evaluation you can do things like: `getFirstTrue p = head . filter p` without worrying that it will traverse the whole list before returning the first element. If you use strict evaluation you want to avoid this thing. This become more important with things like folds/maps etc mixed together. — Bakuriu, Jul 17 '15 at 13:50
@MathematicalOrchid From my point of view Io makes sense to separate pure vs impure *independently* of the evaluation strategy. It's smart to do so even in a strict language. The fact that it solves even that problem is just a plus. — Bakuriu, Jul 17 '15 at 13:54
@Bakuriu Separating them out is a very smart move. My point is that lazy evaluation kind of *forces* you to do it, whereas in a strict language you might be tempted to bypass the separation "just this one time", which slowly leads to being "all the time". And I think that's what the sentiments the OP is asking about refers to. — MathematicalOrchid, Jul 17 '15 at 13:57
@Sibi It's not so much that lazy evaluation *led* to purity; rather, lazy evaluation forced them to keep Haskell pure. For information, at some stage in [Software Engineering Radio podcast #108](http://www.se-radio.net/2008/08/episode-108-simon-peyton-jones-on-functional-programming-and-haskell/), SPJ explains why "purity kept us pure". — jub0bs, Jul 17 '15 at 14:32
@Jubobs Sorry, I should have worded it properly. I would like to understand how lazy evaluation forced them to keep Haskell pure. I will edit the question. — Sibi, Jul 17 '15 at 14:48

score 10 · Answer 1 · edited Sep 23 '20 at 15:30

It's not so much that lazy evaluation led to purity; Haskell was pure to begin with. Rather, lazy evaluation forced the designers of the language to keep the language pure.

Here is a relevant passage from the article A History of Haskell: Being Lazy with Class:

Once we were committed to a lazy language, a pure one was inescapable. The converse is not true, but it is notable that in practice most pure programming languages are also lazy. Why? Because in a call-by-value language, whether functional or not, the temptation to allow unrestricted side effects inside a “function” is almost irresistible.

Purity is a big bet, with pervasive consequences. Unrestricted side effects are undoubtedly very convenient. Lacking side effects, Haskell’s input/output was initially painfully clumsy, which was a source of considerable embarrassment. Necessity being the mother of invention, this embarrassment ultimately led to the invention of monadic I/O, which we now regard as one of Haskell’s main con- tributions to the world, as we discuss in more detail in Section 7.

Whether a pure language (with monadic effects) is ultimately the best way to write programs is still an open question, but it certainly is a radical and elegant attack on the challenge of programming, and it was that combination of power and beauty that motivated the designers. In retrospect, therefore, perhaps the biggest single benefit of laziness is not laziness per se, but rather that laziness kept us pure, and thereby motivated a great deal of productive work on monads and encapsulated state.

(my emphasis)

I also invite you to listen to 18'30'' of Software Engineering Radio podcast #108 for an explanation by the man himself. And here is a longer but relevant passage from SPJ's interview in Peter Seibel's Coders at Work:

I now think the important thing about laziness is that it kept us pure. [...]

[...] if you have a lazy evaluator, it’s harder to predict exactly when an expression is going to be evaluated. So that means if you want to print something on the screen, every call-by-value language, where the order of evaluation is completely explicit, does that by having an impure “function”—I’m putting quotes around it because it now isn’t a function at all—with type something like string to unit. You call this function and as a side effect it puts something on the screen. That’s what happens in Lisp; it also happens in ML. It happens in essentially every call-by-value language.

Now in a pure language, if you have a function from string to unit you would never need to call it because you know that it just gives the answer unit. That’s all a function can do, is give you the answer. And you know what the answer is. But of course if it has side effects, it’s very important that you do call it. In a lazy language the trouble is if you say, “f applied to print "hello",” then whether f evaluates its first argument is not apparent to the caller of the function. It’s something to do with the innards of the function. And if you pass it two arguments, f of print "hello" and print "goodbye", then you might print either or both in either order or neither. So somehow, with lazy evaluation, doing input/output by side effect just isn’t feasible. You can’t write sensible, reliable, predictable programs that way. So, we had to put up with that. It was a bit embarrassing really because you couldn’t really do any input/output to speak of. So for a long time we essentially had programs which could just take a string to a string. That was what the whole program did. The input string was the input and result string was the output and that’s all the program could really ever do.

You could get a bit clever by making the output string encode some output commands that were interpreted by some outer interpreter. So the output string might say, “Print this on the screen; put that on the disk.” An interpreter could actually do that. So you imagine the functional program is all nice and pure and there’s sort of this evil interpreter that interprets a string of commands. But then, of course, if you read a file, how do you get the input back into the program? Well, that’s not a problem, because you can output a string of commands that are interpreted by the evil interpreter and using lazy evaluation, it can dump the results back into the input of the program. So the program now takes a stream of responses to a stream of requests. The stream of requests go to the evil interpreter that does the things to the world. Each request generates a response that’s then fed back to the input. And because evaluation is lazy, the program has emitted a response just in time for it to come round the loop and be consumed as an input. But it was a bit fragile because if you consumed your response a bit too eagerly, then you get some kind of deadlock. Because you’d be asking for the answer to a question you hadn’t yet spat out of your back end yet.

The point of this is laziness drove us into a corner in which we had to think of ways around this I/O problem. I think that that was extremely important. The single most important thing about laziness was it drove us there.

(my emphasis)

@MaxBarraclough Thanks. I've found another copy on Microsoft's servers, and I fixed the broken link. — jub0bs, Sep 23 '20 at 11:26

score 8 · Accepted Answer · answered Jul 17 '15 at 20:05

I think the answer by Jubobs already sums it up nicely (with good references). But, in my own words, what I think SPJ and friends are referring to is this:

Having to go through this "monad" business can be really inconvenient at times. The sheer volume of questions on Stack Overflow asking "how do I just remove this IO thing?" is testament to the fact that sometimes, you really really want to just print out this one value right here — usually for the purposes of figuring out what the actual **** is going on!

In an eager language, it would be sorely tempting to just start adding magic impure functions that let you do impure stuff directly, like in other languages. No doubt you'd start with small things at first, but slowly you slide down this slippery slope, and before you know it, effects are all over the place.

In a lazy language like Haskell, this temptation still exists. There are many times when it would be really helpful to be able to just quickly sneak this one little effect in here or there. Except that because of laziness, adding effects turns out to be almost utterly useless. You can't control when anything happens. Even just Debug.trace tends to produce utterly incomprehensible results.

In short, if you're designing a lazy language, you really are forced to come up with a coherent story for how you're handling effects. You can't just go "meh, we'll pretend this function is just magic"; without the ability to control effects more precisely, you'll instantly end up in a horrible mess!

TL;DR In an eager language, you can get away with cheating. In a lazy language, you really have to do things properly, or it just plain doesn't work.

And that is why we hired Alex — wait, wrong window...

From another perspective, the cheating is much more *constrained*, and therefore much more *interesting*. It often takes years before a Haskell programmer develops a decent sense of *when* and *how* to cheat! — dfeuer, Sep 23 '20 at 17:36

score 2 · Answer 3 · edited May 23 '17 at 11:53

It depends by what you mean for "pure" in this context.

If for pure you mean as in purely functional, then what @MathematicalOrchid is true: with lazy evaluation you wouldn't know in which sequence impure actions are executed and hence you wouldn't be able to write meaningful programs at all and you are forced to be more pure (using the IO monad).

However I find this not really satisfying in this case. A true functional language already separates pure and impure code, so even a strict one should have some kind of IO.
However it's possible that the statement is more broad and refers to pure in the fact that you are able to express more easily, in a more declarative, compositional and yet efficient way the code.

Looking at this answer which makes exactly the statement you are citing it links to the article Why Functional Programming Matters by Hughes which is probably the one you are talking about.

The paper shows how higher order functions and lazy evaluation allow to write more modular code. Note that it does not say anything about purely functional etc. The point it is making is about being more modular and more declarative without losing in efficiency.

The article provides some examples. For example the Newton-Raphson algorithm: in a strict language you have to combine together the code that computes the next approximation and the one that checks if you obtained a good enough solution.

With lazy evaluation you can generate the infinite list of approximations in a function and call this from an other function that returns the first good enough approximation.

In Thinking Functionally with Haskell Richard Bird makes precisely this point. If we look at Chapter 2, exercise D:

Beaver is an eager evaluator, while Susan is a lazy one.

[...]

What alternative might Beaver prefer to head . filter p . map f?

And the answer says:

[...] Instead of defining first p f = head . filter p . map f, Beaver might define
first :: (b -> Bool) -> (a -> b) -> [a] -> b
first p xs | null xs = error "Empty list"
           | p x = x
           | otherwise = first p f (tail xs)
           where x = f (head xs)
The point is that with eager evaluation most functions have to be defined using explicit recursion, not in terms of useful component functions like map and filter.

So pure here means that allows declarative, compositional and yet efficient definitions while with eager evaluation using declarative and compositional definitions can lead to unnecessarily inefficient code.

I will inform the Lispers, MLers, and Discipline Disciplers that their preferred languages aren't "true" functional languages. I think you just get the historical utility and role of functional purity wrong. — Jonathan Cast, Jul 17 '15 at 15:05
@jcast Well, it depends on what you mean by functional language. My point is that Haskell was designed to be *very* functional and the biggest step in this direction is utter separation between side effects and evaluation of expressions. So, yes I'd definitely say that those languages are less "purely-functional" than Haskell, yet I do *not* think that Haskell is more pure just because it is lazy evaluated. — Bakuriu, Jul 17 '15 at 16:13
@Bakuriu They are less purely functional _languages_ because there is no language support for writing pure code. They do _allow_ you to write purely functional code though; And it is in fact how significant portions of code are written. — Cubic, Jul 17 '15 at 19:54
@Cubic Indeed. C also allows you to write purely functional code.There's not a lot of pure functional C around. — AndrewC, Jul 20 '15 at 00:22

score 1 · Answer 4 · answered Jul 17 '15 at 15:14

Strictly speaking, this claim isn't true, because Haskell has unsafePerformIO, which is a big hole in the language's functional purity. (It exploits a hole in GHC Haskell's functional purity which ultimately goes back to the decision to implement unboxed arithmetic by adding a strict fragment to the language). unsafePerformIO exists because the temptation to say "well, I'll implement just this one function using side-effects internally" is irresistible to most programmers. But if you look at the drawbacks of unsafePerformIO[1], you see exactly what people are saying:

unsafePerformIO a is not guaranteed to ever execute a.
Nor is it guaranteed to execute a only once, if it does execute.
Nor is there any guarantee about the relative ordering of the I/O performed by a with other parts of the program.

Those drawbacks keep unsafePerformIO mostly limited to the safest and most careful uses, and are why people do use IO directly until it becomes too inconvenient.

[1] Besides the type-unsafety; let r :: IORef a; r = unsafePerformIO $ newIORef undefined gives you a polymorphic r :: IORef a, which can be used to implement unsafeCast :: a -> b. ML has a solution for reference allocation that avoids that, and Haskell could have solved it in a similar way if purity wasn't considered desirable anyway (the monomorphism restriction is nearly a solution anyway, you just have to ban people from working around it by using a type signature, like I did above).

Most uses of `unsafePerformIO` that I've seen aren't even for IO, but for situations where `ST` doesn't quite fit (for example: concurrent algorithms that always return the same result given the same arguments, but which can't be proven to do so just using the type system) — Jeremy List, Jul 20 '15 at 01:36

How lazy evaluation forced Haskell to be pure

4 Answers4

Linked