17

I'm trying to understand the Haskell 2010 Report section 3.17.2 "Informal Semantics of Pattern Matching". Most of it, relating to a pattern match succeeding or failing seems straightforward, however I'm having difficulty understanding the case which is described as the pattern match "diverging".

I'm semi-persuaded it means that the match algorithm does not "converge" to an answer (hence the match function never returns). But if doesn't return, then, how can it return a value, as suggested by the parenthetical "i.e. return "? And what does it mean to "return " anyway? How one handle that outcome?

Item 5 has the particularly confusing (to me) point "If the value is , the match diverges". Is this just saying that a value of produces a match result of ? (Setting aside that I don't know what that outcome means!)

Any illumination, possibly with an example, would be appreciated!


Addendum after a couple of lengthy answers: Thanks Tikhon and all for your efforts.

It seems my confusion comes from there being two different realms of explanation: The realm of Haskell features and behaviors, and the realm of mathematics/semantics, and in Haskell literature these two are intermingled in an attempt to explain the former in terms of the latter, without sufficient signposts (to me) as to which elements belong to which.

Evidently "bottom" is in the semantics domain, and does not exist as a value within Haskell (ie: you can't type it in, you never get a result that prints out as " ").

So, where the explanation says a function "returns ", this refers to a function that does any of a number of inconvenient things, like not terminate, throw an exception, or return "undefined". Is that right?

Further, those who commented that actually is a value that can be passed around, are really thinking of bindings to ordinary functions that haven't yet actually been called upon to evaluate ("unexploded bombs" so to speak) and might never, due to laziness, right?

gwideman
  • 2,705
  • 1
  • 24
  • 43
  • Returning undefined is just like throwing an exception--that's how `undefined` is defined. And yes, passing bottom around is possible only because Haskell is not strict, and those values are indeed something like unexploded bombs. – Tikhon Jelvis Feb 05 '13 at 04:14
  • But nothing in the program, in Haskell, actually _knows_ (or could test) that a value "is" a bottom, until/unless it's actually called and has its effect, at which point it's an actual exception, undefined or non-termination, right? (Still trying to nail down whether "bottom" is an actual value in Haskell, as opposed to a semantic concept, useful for thinking but not an actual value). – gwideman Feb 05 '13 at 04:24
  • Yeah, you can't test for it. Consider the non-termination case: how do you know that evaluation won't terminate? It's undecidable in general. If you have an infinite loop on your hands, you don't really have a normal value but you do have `⊥`. I hope that clarifies things--as the comments on my answer show, I couldn't think of a good way to describe it. – Tikhon Jelvis Feb 05 '13 at 04:30
  • OK, thanks. I think for my brain this means that bottom is not an actual Haskell value or type. It's a word/symbol by which to talk about functions that might return undefined, might not return, or might exception. As for the comments on your answer, they were helpful in their own right in demonstrating the varied ways this is discussed and thought about. Thanks again. – gwideman Feb 05 '13 at 04:38
  • A really important way of seeing the "non-testability" (in Haskell) of `⊥` is to understand monotonicity. There's a long way of seeing it (which explains the CPO domain of Haskell semantics) or a short way: the full semantics of `⊥` are that you can't pattern match on it. Any "final" constructor *could* match it (False, True, 3, []), but as soon as you do the semantics of the match become infected to also be just `⊥`. So any function that matches on an infinite loop IS an infinite loop. Same with `undefined` and almost the same with errors (note that libraries like `spoon` violate this). – J. Abrahamson Feb 05 '13 at 15:13

2 Answers2

19

The value is ⊥, usually pronounced "bottom". It is a value in the semantic sense--it is not a normal Haskell value per se. It represents computations that do not produce a normal Haskell value: exceptions and infinite loops, for example.

Semantics is about defining the "meaning" of a program. In Haskell, we usually talk about denotational semantics, where the value is a mathematical object of some sort. The most trivial example would be that the expression 10 (but also the expression 9 + 1) have denotations of the number 10 (rather than the Haskell value 10). We usually write that ⟦9 + 1⟧ = 10 meaning that the denotation of the Haskell expression 9 + 1 is the number 10.

However, what do we do with an expression like let x = x in x? There is no Haskell value for this expression. If you tried to evaluate it, it would simply never finish. Moreover, it is not obvious what mathematical object this corresponds to. However, in order to reason about programs, we need to give some denotation for it. So, essentially, we just make up a value for all these computations, and we call the value ⊥ (bottom).

So ⊥ is just a way to define what a computation that doesn't return "means".

We also define other computations like undefined and error "some message" as because they also do not have obvious normal values. So throwing an exception corresponds to . This is exactly what happens with a failed pattern match.

The usual way of thinking about this is that every Haskell type is "lifted"--it contains . That is, Bool corresponds to {⊥, True, False} rather than just {True, False}. This represents the fact that Haskell programs are not guaranteed to terminate and can have exceptions. This is also true when you define your own type--the type contains every value you defined for it as well as .

Interestingly, since Haskell is non-strict, can exist in normal code. So you could have a value like Just ⊥, and if you never evaluate it, everything will work fine. A good example of this is const: const 1 ⊥ evaluates to 1. This works for failed pattern matches as well:

const 1 (let Just x = Nothing in x) -- 1

You should read the section on denotational semantics in the Haskell WikiBook. It's a very approachable introduction to the subject, which I personally find very fascinating.

Tikhon Jelvis
  • 67,485
  • 18
  • 177
  • 214
  • 3
    To expand on this answer, note that in haskell, we, informally speaking, _can_ return `_|_`, because we are in a non-strict setting. This is to say that even if a value has the denotation `_|_` we can still pass it around and soforth, with the caveat that if we poke it, then the expression which pokes it _also_ takes on the denotation `_|_`. – sclv Feb 05 '13 at 01:07
  • I would think that `⊥` is very much a real Haskell value, since it can be passed around and has a representation in memory, it can be evaluated, etc. – Dietrich Epp Feb 05 '13 at 01:07
  • @DietrichEpp: It's a little bit confusing: while `⊥` is a value in a certain sense, you can never actually get a `⊥` to hold--there is no pattern of bits that represents `⊥`, unlike other values like numbers. Can you think of a better way to describe it? – Tikhon Jelvis Feb 05 '13 at 01:09
  • There are many patterns of bits that represent `⊥`. You can store `⊥` in memory. – Dietrich Epp Feb 05 '13 at 01:10
  • I think the problem here is that you haven't defined what "Haskell value" is. If you were to ask me what the members of `Bool` are, I would say that there are three: `True`, `False`, and `⊥`. Each of them is a member of `Bool` in Haskell. Saying that `⊥` isn't a value makes it difficult to reason about functions in Haskell -- `⊥` makes all functions total, and if `⊥` were not a value then `sqrt` would not be a function. – Dietrich Epp Feb 05 '13 at 01:23
  • 3
    There's also the confusing observable difference between different `⊥`s such as `undefined`, `error "undefined"` and `let x = x in x`. Each has a different technical Haskell value but they're merged in denotational semantics just because. – J. Abrahamson Feb 05 '13 at 01:46
  • 1
    Arguably, "Haskell" is a syntax and a mapping to a semantic domain which gives it meaning. `⊥` is a value in the semantic domain, just as 2 is. `let x = x in x` and `1 + 1` are both expressions. It so happens that `⊥`-the-value is a value with the property that causes expressions in the semantic domain that are strict in that argument to also be `⊥`. – sclv Feb 05 '13 at 02:07
  • Hi Tikhon and commenters -- very helpful. Please see the addendum to my question to see if I distilled this sensibly. – gwideman Feb 05 '13 at 04:07
  • 2
    @tel As the person running the program, you are able to observe a difference between `undefined`, `error` and nontermination. But from within the program there is no observable difference between them; they all represent a catastrophic event which cannot be recovered from. That's the sense in which they are semantically the same. – Tom Crockett Feb 05 '13 at 05:06
  • @pelotom That was exactly what I was getting at, though perhaps I wasn't very clear. And it often makes me wonder about libraries like `spoon`. – J. Abrahamson Feb 05 '13 at 15:15
11

Denotational semantics

So, briefly denotational semantics, which is where lives, is a mapping from Haskell values to some other space of values. You do this to give meaning to programs in a more formal manner than just talking about what programs should do—you say that they must respect their denotational semantics.

So for Haskell, you often think about how Haskell expressions denote mathematical values. You often see Strachey brackets ⟦·⟧ to denote the "semantic mapping" from Haskell to Math. Finally, we want our semantic brackets to be compatible with semantic operations. For instance

⟦x + y⟧ = ⟦x⟧ + ⟦y⟧

where on the left side + is the Haskell function (+) :: Num a => a -> a -> a and on the right side it's the binary operation in a commutative group. While is cool, because then we know that we can use the properties from the semantic map to know how our Haskell functions should work. To wit, let's write the commutative property "in Math"

  ⟦x⟧ + ⟦y⟧ == ⟦y⟧ + ⟦x⟧ 
= ⟦x + y⟧ == ⟦y + x⟧ 
= ⟦x + y == y + x⟧

where the third step also indicates that the Haskell (==) :: Eq a => a -> a -> a ought to have the properties of a mathematical equivalence relationship.


Well, except...

Anyway, that's all well and good until we remember that computers are finite things and Maths don't much care about that (unless you're using intuitionistic logic, and then you get Coq). So, we have to take note of places where our semantics don't follow Math quite right. Here are three examples

⟦undefined⟧         = ??
⟦error "undefined"⟧ = ??
⟦let x = x in x⟧    = ??

This is where comes into play. We just assert that so far as the denotational semantics of Haskell are concerned each of those examples might as well mean (the newly introduced Mathematical/semantic concept of) . What are the Mathematical properties of ? Well, this is where we start to really dive into what the semantic domain is and start talking about monotonicity of functions and CPOs and the like. Essentially, though, is a mathematical object which plays roughly the same game as non-termination does. To the point of view of the semantic model, is toxic and it infects expressions with its toxic-nondeterminism.

But it's not a Haskell-the-language concept, just a Semantic-domain-of-the-language-Haskell thing. In Haskell we have undefined, error and infinite looping. This is important.


Extra-semantic behavior (side note)

So the semantics of ⟦undefined⟧ = ⟦error "undefined"⟧ = ⟦let x = x in x⟧ = ⊥ are clear once we understand the mathematical meanings of , but it's also clear that those each have different effects "in reality". This is sort of like "undefined behavior" of C... it's behavior that's undefined so far as the semantic domain is concerned. You might call it semantically unobservable.


So how does pattern matching return ?

So what does it mean "semantically" to return ? Well, is a perfectly valid semantic value which has the infection property which models non-determinism (or asynchronous error throwing). From the semantic point of view it's a perfectly valid value which can be returned as is.

From the implementation point of view, you have a number of choices, each of which map to the same semantic value. undefined isn't quite right, nor is entering an infinite loop, so if you're going to pick a semantically undefined behavior you might as well pick one that's useful and throw an error

*** Exception: <interactive>:2:5-14: Non-exhaustive patterns in function cheers
Tikhon Jelvis
  • 67,485
  • 18
  • 177
  • 214
J. Abrahamson
  • 72,246
  • 9
  • 135
  • 180
  • tel: Thanks for your diligent effort here. It's helpful that you directed my attention to the extent to which Haskell explanations often entail relating Haskell behavior to some semantic domain. That said, my question is really about what happens in actual Haskell, not the semantic concepts that correspond to what happens. I think you were getting close in your final "how does pattern matching return ⊥" section, but I didn't really get what you were trying to say with the "you have a number of choices" etc. Where do I have these choices? – gwideman Feb 05 '13 at 04:17
  • You as a Haskell user don't have those choices, if you're implementing Haskell then you do. An implementation can choose to do whatever it likes to evaluate `⊥` at the top level. In fact, it may not be able to do anything besides hang if that `⊥` really represents an infinite loop. But things like pattern match failures are allowed to be more informative and this is where you get a "choice". – J. Abrahamson Feb 05 '13 at 15:26