64

Haskell has been called a "pure functional language."

What does "pure" mean in this context? What consequences does this have for a programmer?

nbro
  • 15,395
  • 32
  • 113
  • 196
Bobby S
  • 4,006
  • 9
  • 42
  • 61
  • 1
    The consequence is that you can't use other programming paradigms -- e.g. object-oriented. Also, this must be homework? – Rafe Kettler Dec 07 '10 at 22:25
  • 8
    @Rafe: No, you can (though there can be an impedance mismatch); it means that there aren't any side effects. – Antal Spector-Zabusky Dec 07 '10 at 22:30
  • @Antal I was talking about consequences rather than meaning. – Rafe Kettler Dec 07 '10 at 22:35
  • 12
    @Rafe Kettler: Lack of side-effects works just fine with OOP; in fact, the popularity of immutable objects is a step in precisely that direction. – C. A. McCann Dec 07 '10 at 22:38
  • See http://homepages.cwi.nl/~ralf/OOHaskell/ – Paul Johnson Dec 08 '10 at 06:39
  • @RafeKettler specifically [Haskell can't do virtual inheritance](http://stackoverflow.com/a/8409828/615784), but this is due to its typeclass, not due to it being [pure. Declarative](http://stackoverflow.com/a/8357604/615784) languages can have virtual inheritance (I am working on one). – Shelby Moore III Dec 09 '11 at 20:08

7 Answers7

64

In a pure functional language, you can't do anything that has a side effect.

A side effect would mean that evaluating an expression changes some internal state that would later cause evaluating the same expression to have a different result. In a pure functional language you can evaluate the same expression as often as you want with the same arguments, and it would always return the same value, because there is no state to change.

For example, a pure functional language cannot have an assignment operator or do input/output, although for practical purposes, even pure functional languages often call impure libraries to do I/O.

Joel Spolsky
  • 33,372
  • 17
  • 89
  • 105
  • 13
    I/O is still possible in a purely functional language - you can treat the input stream as a list of characters, and similarly, you can return a (possibly lazy-evaluated) list of characters to be written as output. – Anon. Dec 07 '10 at 22:27
  • 3
    In fact, early versions of Haskell worked exactly like that, though generalized to allow for more I/O actions than just reading and writing a single stream of characters. – ephemient Dec 07 '10 at 22:42
  • I would rather see an explanation in terms of what a pure function guarantees, rather than a loose account of what it "can't do". – luqui Dec 08 '10 at 02:31
  • 23
    A functional language that merely calls impure libraries to do I/O isn't pure. Haskell is pure, and it can do I/O. The trick is to wrap up the I/O "actions" as immutable values. So "getChar" is a constant value of type "IO Char", and IO values can be composed into larger operations. The language has a notional interpreter that processes the IO value returned by the "main" function. – Paul Johnson Dec 08 '10 at 06:38
  • 19
    This is a good answer, but I'm not sure it's the best one. As such, I'm surprised that it has the highest number of votes by far. Speaking from past experience, many people are confused about how a language like Haskell is claimed to be a pure functional programming language, and yet can still do something useful (e.g. I/O). This answer does nothing to clarify that. – Daniel Pratt Dec 08 '10 at 12:21
  • @luqui: A pure function guarantees that it won't do anything that pure functions can't do (e.g. it won't modify a global variable behind your back). Therefore when you're reading a program, knowing that a function is pure makes it a lot easier to understand what the program is doing. – Gilles 'SO- stop being evil' Dec 08 '10 at 21:40
  • 1
    @Daniel Pratt: There is a vast array of confusing topics outside the purview of the very basic question "What is a 'pure functional language'?" That is one of them. Focusing on Haskell's hand-wavy IO in answer to that question just leads to further confusion. That would be a good thing to discuss if the question were, "How does Haskell do IO while remaining a pure functional language?" – Chuck Dec 09 '10 at 17:54
  • 2
    I prefer, "In a ''pure'' functional language, your function can't have side effects without telling people that it has them." It's technically untrue, thanks to the like of `seq` and `unsafePerformIO`, but it's the general feel. Better still is: "In a ''pure'' functional language, you can replace any expression with the results of that expression, and not change how your program works." It has the same caveats, but it expresses the spirit of the usefulness of "purity" more positively. – BMeph Dec 28 '10 at 17:00
  • 9
    On another note, despite the super-star level and deserved humongous reputation of Mr. Spolsky, I think he got a crucial detail wrong here, and that those up-voting his answer did so because he wrote it, not because it's right. – BMeph Dec 28 '10 at 17:01
  • A pure expression can also include the assignment operator, but it takes a left operand that has the [bottom type](http://stackoverflow.com/a/8450398/615784) (i.e. no value), e.g. `val x = 1`, and the type of that expression is the bottom type. In other words, the pure assignment operator associates an uninitialized identifier to a value. – Shelby Moore III Dec 09 '11 at 20:28
  • 1
    @PaulJohnson Haskell's `IO` monad is [most definitely *impure*](http://goldwetrust.up-with.com/t112p180-computers#4434), w.r.t. the external `World`. [Monadic composition is pure](http://stackoverflow.com/a/8357604/615784) (except in this case where Haskell's compiler is "lying"). – Shelby Moore III Dec 09 '11 at 21:01
  • @DanielPratt A pure language can never do I/O. A pure way to do I/O, is functional reactive programming (i.e. call me, I won't call you). But that requires imperative code to call the _pure_ FRP functions with varying values over time. Haskell puts this imperative code "external to main()", but that means every Haskell program has some imperative code in it. It is [impossible to avoid the imperative world](http://stackoverflow.com/q/8409822/615784) (read [my comments to michid](http://stackoverflow.com/q/8409822/615784) and also [this link](http://stackoverflow.com/a/8352969)). – Shelby Moore III Dec 09 '11 at 21:13
  • 1
    @BMeph You are correct that there can be "side-effects" in a pure (i.e. RT) expression, e.g. monadic sequence order. The definition of "effects" is ambiguous in this context. However, a pure expression is always RT, which has a very precise definition. [Monadic composition is always RT](http://stackoverflow.com/a/8357604/615784) (except where Haskell is lying about `IO`). – Shelby Moore III Dec 09 '11 at 21:18
  • Where I wrote "bottom type" above, some claim this is the unit type, to which I agree, except to the extent that non-termination can't be formally proven, then perhaps it is the bottom type. Also the non-RT (i.e. imperative) code for `IO` is "external to main()", thus the `IO` use within the main problem is RT (i.e. pure), but the program overall is impure, because it contains impure (i.e. non-RT) code. There exist 3 words to say the same thing, imperative, non-RT, impure. Also 3 words to say the opposite, declarative, RT, pure. – Shelby Moore III Dec 10 '11 at 22:58
37

"Pure" and "functional" are two separate concepts, although one is not very useful without the other.

A pure expression is idempotent: it can be evaluated any number of times, with identical results each time. This means the expression cannot have any observable side-effects. For example, if a function mutated its arguments, set a variable somewhere, or changed behavior based on something other than its input alone, then that function call is not pure.

A functional programming language is one in which functions are first-class. In other words, you can manipulate functions with exactly the same ease with which you can manipulate all other first-class values. For example, being able to use a "function returning bool" as a "data structure representing a set" would be easy in a functional programming language.

Programming in functional programming languages is often done in a mostly-pure style, and it is difficult to be strictly pure without higher-order function manipulation enabled by functional programming languages.

Haskell is a functional programming language, in which (almost) all expressions are pure; thus, Haskell is a purely functional programming language.

ephemient
  • 198,619
  • 38
  • 280
  • 391
  • 3
    Excellent. You are one of the few I have found who understands [not to conflate](http://stackoverflow.com/a/8357604/615784) _pure_ (i.e. RT) and FP. – Shelby Moore III Dec 09 '11 at 21:29
  • "Pure" is just a modifier. You can just as easily have a pure logic programming language as a pure functional language. You should probably have said "Functional and pure functional are two separate concepts". – Hugh Allen Sep 04 '12 at 03:51
  • 1
    f is idempotent => f . f = f. You're statement equivocates the term idempotent with referential transparency. So even though you've gotten the semantics right, you really shouldn't be using the term idempotent that way. – RussellStewart Jul 13 '14 at 00:26
  • 2
    I think the statement "A pure expression is idempotent" is wrong. `f(x): x + 1` is a simple increment function and is "Pure" because it does not have any side effects or state. Whereas, it is not idempotent because `f(f(x)) != f(x)`. Whereas a function like `g(x): x - x` is both pure and idempotent. Because it always returns 0. – Kartik Sreenivasan May 16 '17 at 10:09
24

A pure function is one which has no side effects — it takes a value in and gives a value back. There's no global state that functions modify. A pure functional language is one which forces functions to be pure. Purity has a number of interesting consequences, such as the fact that evaluation can be lazy — since a function call has no purpose but to return a value, then you don't need to actually execute the function if you aren't going to use its value. Thanks to this, things like recursive functions on infinite lists are common in Haskell.

Another consequence is that it doesn't matter in what order functions are evaluated — since they can't affect each other, you can do them in any order that's convenient. This means that some of the problems posed by parallel programming simply don't exist, since there isn't a "wrong" or "right" order for functions to execute.

Chuck
  • 234,037
  • 30
  • 302
  • 389
15

Strictly speaking, a pure functional language is a functional language (i.e. a language where functions are first-class values) where expressions have no side effects. The term “purely functional language” is synonymous.

By this definition, Haskell is not a pure functional language. Any language in which you can write programs that display their result, read and write files, have a GUI, and so on, is not purely functional. Thus no general purpose programming language is purely functional (but there are useful domain-specific purely functional languages: they can typically be seen as embedded languages in some way).

There is a useful relaxed sense in which languages like Haskell and Erlang can be considered purely functional, but languages like ML and Scheme cannot. A language can be considered purely functional if there is a reasonably large, useful and well-characterised subset where side effects are impossible. For example, in Haskell, all programs whose type is not built from IO or other effect-denoting monad are side-effect-free. In Erlang, all programs that don't use IO-inducing libraries or concurrency features are side-effect-free (this is more of a stretch than the Haskell case). Conversely, in ML or Scheme, a side effect can be buried in any function.

By this perspective, the purely functional subset of Haskell can be seen as the embedded language to deal with the behavior inside each monad (of course this is an odd perspective, as almost all the computation is happening in this “embedded” subset), and the purely functional subset of Erlang can be seen as the embedded language do deal with local behavior.


Graham Hutton has a slightly different, and quite interesting, perspective on the topic of purely functional languages:

Sometimes, the term “purely functional” is also used in a broader sense to mean languages that might incorporate computational effects, but without altering the notion of ‘function’ (as evidenced by the fact that the essential properties of functions are preserved.) Typically, the evaluation of an expression can yield a ‘task’, which is then executed separately to cause computational effects. The evaluation and execution phases are separated in such a way that the evaluation phase does not compromise the standard properties of expressions and functions. The input/output mechanisms of Haskell, for example, are of this kind.

I.e. in Haskell, a function has the type a -> b and can't have side effects. An expression of type IO (a -> b) can have side effects, but it's not a function. Thus in Haskell functions must be pure, hence Haskell is purely functional.

Gilles 'SO- stop being evil'
  • 104,111
  • 38
  • 209
  • 254
  • 2
    An IO function has the slightly different type `a -> IO b`, and indeed is a function which just returns an IO wrapped value. – fuz Dec 09 '10 at 01:21
  • 1
    @FUZxxl But that doesn't make `IO` pure (the compiler is lying about the purity). See my comments and links under Joel Spolsky's answer. – Shelby Moore III Dec 09 '11 at 21:37
  • @ShelbyMooreIII No one here is claiming that `IO` is pure (that would be rather silly). The compiler isn't lying, except insofar as you consider nonterminating programs impure. – Gilles 'SO- stop being evil' Dec 09 '11 at 22:33
  • @Gilles "No one"? I see 5 upvotes on Paul Johnson's comment under Joel Spolsky's answer, where Paul claims `IO` in Haskell is pure. Why are you mentioning "nonterminating"? Are you conflating the [other page](http://stackoverflow.com/a/8450398/615784) where I was writing about a bottom type? Seems you are confusing something that has nothing to do with the logic I am presenting about the `IO` monad in Haskell. Please explain. – Shelby Moore III Dec 10 '11 at 00:43
  • 1
    @ShelbyMooreIII You seem to be confusing computations of values of type `IO a` (the “immutable values” that Paul refers to), which are pure, with the execution of such values, which is effectful (Paul's “notional interpreter”). I mention nonterminating because strictly speaking, nonterminating are not referentially transparent, since the program has a different behavior if you evaluate them at least once or not at all. This has nothing to do with a bottom type; it is very much related with including a bottom value in every type. – Gilles 'SO- stop being evil' Dec 10 '11 at 01:11
  • @Gilles Execution has nothing to do with it. The imperative code for `IO` is included in the program. The program contains imperative code, therefor the program is not 100% pure. Haskell hides it, but it is still included in the program before it executes. Just the same as if you used Scala and wrote some portions purely and other portions without RT. I understand very well that every Haskell inductive type is populated by the bottom value, [because in lazy languages bottom is a value, not an effect](http://existentialtype.wordpress.com/2011/04/24/the-real-point-of-laziness/#comment-798). – Shelby Moore III Dec 10 '11 at 01:31
4

As there cannot be any side effects in pure functional code, testing gets much easier as there is no external state to check or verify. Also, because of this, extending code may become easier.

I lost count of the number of times I had trouble with non-obvious side effects when extending/fixing (or trying to fix) code.

Marcus Borkenhagen
  • 6,536
  • 1
  • 30
  • 33
4

As others have mentioned, the term "pure" in "pure functional programming language" refers to the lack of observable side-effects. For me, this leads straight to the question:

What is a side-effect?

I have seen side-effects explained both as

  • something that a function does other than simply compute its result
  • something that can affect the result of a function other than the inputs to the function.

If the first definition is the correct one, then any function that does I/O (e.g. writing to a file) cannot be said to be a "pure" function. Whereas Haskell programs can call functions which cause I/O to be performed, it would seem that Haskell is not a pure functional programming language (as it is claimed to be) according to this definition.

For this and other reasons, I think the second definition is the more useful one. According to the second definition, Haskell can still claim to be a completely pure functional programming language because functions that cause I/O to be performed compute results based only on function inputs. How Haskell reconciles these seemingly conflicting requirements is quite interesting, I think, but I'll resist the temptation to stray any further from answering the actual question.

Daniel Pratt
  • 12,007
  • 2
  • 44
  • 61
  • 4
    You're misunderstanding how Haskell does I/O. In early prototypes, `main` was a pure function of a lazy stream of I/O replies from the world into a lazy stream of I/O requests to the world. Modern Haskell uses monads to make this easier to program, but the concept is still the same: a pure value *describing* a transformation to be applied to the outside, impure world. – ephemient Dec 08 '10 at 18:32
  • Perhaps I've explained it poorly, but I believe I do understand how Haskell does I/O (conceptually). For this reason I said "...functions which cause I/O to be performed..." rather than "...functions which perform I/O...". As a matter of implementation, I believe the GHC compiler causes I/O actions to be performed concurrently with a program, but that is an implementation detail of course. – Daniel Pratt Dec 08 '10 at 19:09
  • 3
    Haskell doesn't have functions which perform I/O. It has values (and functions that return values) that, when sequenced in the IO Monad in main, result in IO being performed by the runtime. – ephemient Dec 08 '10 at 19:39
  • I'm confused with your response. I stated explicitly that I understood that Haskell functions don't themselves perform I/O and your response seems to state the very same. Is there anything in what I've said that directly contradicts this? By the way, we've both ignored the fact that Haskell does, in fact, have functions that *will* perform I/O directly, though their use is not recommended except for special circumstances. – Daniel Pratt Dec 08 '10 at 20:01
  • 2
    Yeah, let's all ignore unsafe functions. My gripe is that, within the large safe parts of Haskell, "something that a function does other than simply compute its result" does not exist. Sure, if you include the execution of the runtime then *some* side-effects may happen, but you could replace the whole `IO` monad with `ST`, and your program would be pure and not know the difference. – ephemient Dec 08 '10 at 20:59
  • 3
    @Daniel: +1 for highlighting that it's a tricky point, but “how Haskell reconciles these seemingly conflicting requirements” is at the heart of the question. Also your second definition of side effects is flawed — it doesn't consider printing to be a side effect, for instance. The first definition is correct (within the limits of its simplification). – Gilles 'SO- stop being evil' Dec 08 '10 at 21:37
  • @ephemient In fact your program would know the difference if you replaced `IO` with `ST` monad in Haskell. See my comments under Joel Spolsky's answer, and the links in them. Here is quote from one of the linked [documents I wrote](http://goldwetrust.up-with.com/t112p180-computers#4434), "The World state is not reversible in time, because it is impossible to cache the infinite states of the World for each discrete time (besides time is continuous which is another reason there are infinite states)...". Please read all the points there. – Shelby Moore III Dec 09 '11 at 21:58
  • @Gilles It is not tricky at all. Haskell's type checker is pretending that it knows the internal structure of the `World`, but of course it can't know that, because it is a coinductive type. – Shelby Moore III Dec 09 '11 at 22:00
  • 1
    @ShelbyMooreIII Sure, it is a tricky point (one that you don't seem to fully grasp, either). Haskell isn't pretending to know the internal structure of the world; on the contrary, it goes through pains to know as little as possible about it. By the way, time is almost never continuous in models of computation. – Gilles 'SO- stop being evil' Dec 09 '11 at 22:32
  • @Gilles The [definition of referential transparency](http://stackoverflow.com/a/8346779) (RT), requires the compiler to know the computable structure of the RT functions. Since it can't "count" the `World`, it can't guarantee RT. Make a pure function that reads `IO` and writes to the screen. Call the function 2x, and observe that you are not guaranteed to get the same output. Did you read [the link](http://stackoverflow.com/a/8357604/615784) I gave? I wrote, "imperative w.r.t. the World state external to the program (but in the sense of the caveat below, the side-effects are isolated)". – Shelby Moore III Dec 09 '11 at 23:55
  • @Gilles The program is imperative, because the interaction with the `World` is the state machine of the program. But this is isolated to the "external to main()" imperative code. Nevertheless, that code is part of the executable program, which makes the Haskell program a mix of pure and impure code. It simply can't be any other way. No 100% pure program can interact with the **COINDUCTIVE** reality of our universe. Do you know what a coinductive type is? – Shelby Moore III Dec 09 '11 at 23:59
4

Amr Sabry wrote a paper about what a pure functional language is. Haskell is by this definition considered pure, if we ignore things like unsafePerformIO. Using this definition also makes ML and Erlang impure. There are subsets of most languages that qualify as pure, but personally I don't think it's very useful to talk about C being a pure language.

Higher-orderness is orthogonal to purity, you can design a pure first-order functional language.

Anon'
  • 41
  • 1
  • 2
    Not that your answer isn't right, but I doubt the SO crowd will be happy to see a "read this paper, and you will understand X" answer. Just saying... :) – BMeph Dec 28 '10 at 17:03