26

I don't understand the role played by bottom ( or _|_) in Haskell function definitions.

The definition of zip for example describes it as "right lazy" because

zip [] _|_ = []

but I'm unclear how this differs from

zip [] _ = []

What role is _|_ playing in function definitions such as the one above? In particular, how is it different from using _?


UPDATE AND NOTE: As readers of the excellent answers will discover for themselves, a crucial part of those answers, worth pulling up here, is that does not (and cannot), in fact, appear in Haskell function definitions. Read on.

Community
  • 1
  • 1
orome
  • 45,163
  • 57
  • 202
  • 418
  • 1
    Related: [The concept of bottom in Haskell](http://stackoverflow.com/questions/6379458/the-concept-of-bottom-in-haskell) – dfeuer Sep 11 '15 at 02:00
  • 1
    Also related: [What does this syntax mean in Haskell: ⊥ or (⊥)](http://stackoverflow.com/questions/19794681/what-does-this-syntax-mean-in-haskell-or/19794743#19794743) – dfeuer Sep 11 '15 at 02:01

4 Answers4

39

Bottom is essentially the fancy algebraic way of saying undefined.

If you try this, you can see why zip is lazy for its right-hand argument:

λ> zip [] undefined
[]
λ> zip undefined []
*** Exception: Prelude.undefined

This is because undefined only fails when you try to evaluate it.

You might be confusing _|_ with _ because of the way it was presented. I will make it clear: the line zip [] _|_ = [] does not act as a pattern match but an equation, stating the equality of zip [] _|_ and []. That is to say, this is not valid Haskell code, but a notational, abstract-algebraic way of saying "I don't care about the second argument."

In the definition of zip you may of course use _, but that's irrelevant. You could have used any name, just as long as it wasn't a constructor-matching pattern such as (Just x) or (a,b). Values will remain unevaluated until they must be pattern matched in pure code.

You can read more about lazy evaluation here.

You can read more about bottom here and here.

AJF
  • 11,767
  • 2
  • 37
  • 64
  • What would happen if `_` were used instead? It seems that that would accomplish "right lazy" (or at least my understanding of it: if the left arg is `[]` don't bother with the right)? – orome Sep 10 '15 at 15:37
  • 1
    @raxacoricofallapatorius Fixed. – AJF Sep 10 '15 at 15:37
  • Ah, I see; I was confusing it with a pattern match. Still, if the definition had had `zip [] _ = []`, wouldn't that do the same thing? And if it has, say just `zip x y = ...` then somewhere in the code it needs to say `zip [] _|_ = []`, right? – orome Sep 10 '15 at 15:44
  • @raxacoricofallapatorius It would do the same thing no matter what you put there, whether it be `_`, `qpootlefive` or `foo`. As I said, a value in a pure function will remain unevaluated unless pattern matched. – AJF Sep 10 '15 at 15:46
  • 1
    So the key thing is that somewhere in the definition of `zip`, as a consequence of the way the code for `zip` is written, the match for the left argument to `[]` ends things with a result of `[]` (before the right argument is evaluated). Correct? (And the docs are just describing that consequence.) – orome Sep 10 '15 at 15:54
  • 1
    @raxacoricofallapatorius precisely. There is no reference to the second argument in the result, so it's just dropped. – AJF Sep 10 '15 at 15:58
  • 1
    And one way to accomplish that would be to pattern match `zip [] _ = []` (before any other `zip [] y` match that would have triggered evaluation of `y`). Right? – orome Sep 10 '15 at 16:19
  • 2
    @raxacoricofallapatorius Note that `_|_` is simply an ASCII equivalent of `⊥`. This is a mathematical symbol. In the case of Haskell values the bottom is like an infinite computation or an exception (which are basically treated in the same way). It's a value of type `a` (i.e. any type). This is in contrast with a value of a concrete type. – Bakuriu Sep 10 '15 at 18:55
8

I think the OP already realises this, but for the benefit of others who come here with the same confusion: zip [] _|_ = [] is not actual code!

The symbol _|_ (which is just an ascii-art rendering of the mathematical symbol ) means bottom1, but only when we're talking about Haskell. In Haskell code it does not have this meaning2.

The line zip [] _|_ = [] is a description of a property of the actual code for zip; that if you call it with first argument [] and pass any bottom value as the second argument, the result is equal to []. The reason they would want to say exactly this is because the technical definition of what it means for a function f to be non-strict is when f ⊥ is not .

But there is no role of _|_ (or , or undefined, or the concept of bottom at all) in defining Haskell functions (in code). It has to be impossible to pattern match on an argument to see whether it is , for a number of reasons, and so there is no actual symbol for in Haskell code3. zip [] _|_ = [] is documentation of a property that is a consequence of the definition of zip, not part of its definition.

As a description of this property zip [] _ = [] is a less specific claim; it would be saying that whatever you call zip [] on, it returns []. It amounts to exactly the same thing, since the only way zip [] ⊥ can return something non-bottom is if it never examines its second argument at all. But it's speaking less immediately to the definition of non-strict-ness.

As code forming part of the definition of the function zip [] _ = [] can't be compared and contrasted to zip [] _|_ = []. They're not alternatives, the first is valid code, and the second is not.


1 Which is the "value" of an expression that runs forever, throws an exception, or otherwise falls to evaluate to a normal value.

2 It's not even a valid Haskell identifier, since it contains both "namey" characters (_) and "operator" characters (|). So it can't actually be a symbol meaning anything at all in Haskell code!

3 undefined is often used for , but it's more of a variable referring to a value than the actual thing itself. Much like if you have let xs = [1, 2, 3] you can use xs to refer to the list [1, 2, 3], but you can't use it as a pattern to match some other list against; the attempted pattern match would just be treated as introducing a new variable named undefined or xs shadowing the old one.

Ben
  • 68,572
  • 20
  • 126
  • 174
6

Riffing on AJFarmar's answer, I think this critical point was not made explicit:

  • _|_ is not a valid literal or identifier in Haskell code!
  • And therefore, zip [] _|_ = [] isn't valid code either!

That is implicit in what AJFarmar means by this quote:

[T]he line zip [] _|_ = [] does not act as a pattern match but an equation, stating the equality of zip [] _|_ and [].

To make it very crystal clear, zip [] _|_ = [] appears in the documentation comment for the definition of zip. It's not Haskell code—it's an English-language comment written in an informal technical notation that looks a little bit like Haskell code. Or, in other words, pseudo-code.

Luis Casillas
  • 29,802
  • 7
  • 49
  • 102
  • What does `_bs` mean in the code? Is it the same as `_`? And does the pattern `zip [] _bs = ...` ensure that the second argument is not ever evaluated if the first is `[]` (and there's no mention of the second on the rhs)? – orome Sep 11 '15 at 01:22
  • 1
    @raxacoricofallapatorius, in standard Haskell, the *only* difference between a pattern like `x` and the special pattern `_` is that `_` can never appear on the right-hand side. You can write `f x = x` but not `f _ = _`. In GHC, there's a warning available, enabled by `-Wall` among other things, that warns about unused bindings. In GHC, binding a name starting with an underscore suppresses that warning. So you can write `f _x = 3` to suggest that what's being passed in is an `x`, but tell both the compiler and human readers that you really don't intend to use it. – dfeuer Sep 11 '15 at 01:32
  • 1
    @raxacoricofallapatorius, in particular, binding a variable does not, by itself, force any evaluation. The only ways to force evaluation are pattern matching, `seq`, and `evaluate`, functions that call those, `if-then-else` (forces the condition) and the various numeric operations. (And, with a GHC extension, bang patterns) – dfeuer Sep 11 '15 at 01:39
  • So if I don't refer to it on the rhs, it won't get evaluated. Correct? And why, btw, point out `⊥` in particular, when `zip [] b` will be `[]` for *any* `b` and will leave `b` unevaluated. Does the case of`⊥` satisfy some important mathematical property (that another ignored/unevaluated value wouldn't)? – orome Sep 11 '15 at 01:43
  • @raxacoricofallapatorius, please see my answer for some explanation of where the whole ⊥ thing comes from. It is correct that if you don't use it, it won't be evaluated. The primary driver of evaluation is *pattern matching*. `f x = let z = x + 3 in 7` will not evaluate `x` because `x` is only used to define `z`, and nobody ever looks at `z`. Change that to `f x = let z = x + 3 in if z > 4 then 12 else 13`, and suddenly `f` is strict in `x`. – dfeuer Sep 11 '15 at 01:50
  • 1
    @raxacoricofallapatorius: `_bs` is just a variable, but with one trick up its sleeve: GHC's `warn-unused-binds` feature, which warns of unused variables, skips variables that start with underscore. So what's happening there is that the code in question is compiled with that option, and the use of that name shuts off the warning. Why use that instead of just `_`? I don't know the author's motive, but it's almost certainly a stylistic thing. – Luis Casillas Sep 11 '15 at 19:21
6

⊥ comes out of mathematical order theory. A partially ordered collection has a bottom element, denoted ⊥, if that element precedes every other element. How does this get into Haskell documentation? At some point, computer scientists realized that it would be useful to think about what a computer program, in whatever language, "means". One approach to that is called denotational semantics. In denotational semantics, each term in the programming language is assigned a "denotation", or meaning, in some universe of mathematical meanings. It would be wonderful to be able to say, for instance, that

  1. meaningInteger :: Integer -> mathematical integer
  2. meaningList :: [a] -> possibly-infinite sequence of elements of type a

Unfortunately, this doesn't quite work out in Haskell, because, for instance, I can write

oops :: Integer
oops = oops

This gives me a term of type Integer, but there's no sensible way to assign it a meaning as a mathematical integer. More interestingly, I could write things like

undefined
undefined : undefined
3 : undefined
[undefined]
let foo = undefined : 3 : undefined : foo

These all (can) have the same type, but have various different levels of undefinedness. So we need to add to our collection of meanings various sorts of undefined things. It's possible, however, to impose a partial order on them based on how defined they are! For example, 3 : 4 : [] is more defined than 3 : 4 : undefined, and is also more defined than 3 : undefined : 4, but the latter two are not comparable. The bottom element of each type, its very least defined element, is called ⊥.

dfeuer
  • 48,079
  • 5
  • 63
  • 167
  • How can there be more than two levels of definedness (i.e. defined and undefined)? – Praxeolitic Dec 25 '17 at 14:24
  • @Praxeolitic, I have tried to explain. Can you please be more specific about what you don't understand? – dfeuer Dec 25 '17 at 16:54
  • After the first list of examples of undefined entities, you say they have various levels of undefined, which I didn't understand. Isn't something just defined or undefined? Then "For example, `3 : 4 : []` is more defined than `3 : 4 : undefined`, and is also more defined than `3 : undefined : 4`, but the latter two are not comparable." I interpreted to mean `3 : 4 : undefined` and `3 : undefined : 4` are both undefined and neither is more undefined than the other because there are no distinctions within "undefined", which obviously contradicts my understanding of the previous example. – Praxeolitic Dec 25 '17 at 17:38
  • I do see why the first list of items very vaguely feel like they have various levels of (un)definedness -- I'm not trying to be difficult -- but I'd like to know if there's more to this since the answer says this comes from mathematical theory. – Praxeolitic Dec 25 '17 at 17:46
  • 1
    @Praxeolitic, `undefined` is less defined than either of those partially defined values. Take two structures and line them up against each other. If a defined part of one is different from a defined part of the other, then they are not comparable. Otherwise, if the undefined portions of one lie within the undefined portions of the other, then the first one is more defined. If neither lies within the other, then they are again not comparable. – dfeuer Dec 25 '17 at 23:50
  • `3 : undefined : 4` is not well-typed. To save your example you can replace it with `3 : undefined : []`. (`3 : undefined : 4 : []` is incomparable to `3 : 4 : []` and `3 : undefined` is less defined than `3 : 4 : undefined`.) – peter pun Jun 21 '21 at 17:09
  • Also a better definition of the partial order "less defined" could perhaps be: `x` is less defined than `y` iff `x` is `_|_` or is atomic and equal to `y` or can be lined up against `y` and all of its components are less defined than their corresponding component in `y`. Of course things like "atomic", "components" and "lining up" need definition (which doesn't seem too hard to formulate). – peter pun Jun 21 '21 at 17:11