6

My current understanding of pattern overlapping in Haskell is that 2 patterns are considered to be overlapping if some argument values passed to a function can be matched by multiple patterns.

Given:

last :: [a] -> a
last [x] = x
last (_ : xs) = last xs

passing the argument value [1] would match both the first pattern [x] and the 2nd pattern (_ : xs) - so that would mean the function has overlapping patterns even though both patterns can be matched.

What makes this confusing is that although the patterns are (by the definition above) overlapping, GHC does not show any warning about them being overlapping.

Reverting the 2 pattern matches in the last function does show the overlapping warning:

last :: [a] -> a
last (_ : xs) = last xs
last [x] = x

Warning:

src\OverlappingPatterns.hs:6:1: Warning:
    Pattern match(es) are overlapped
    In an equation for `last': last [x] = ...

It is almost as though GHC consideres the patterns overlapping if a previous pattern makes it impossible to match a pattern which occurs later.

What is the correct way to determine if a function has overlapping patterns or not?


Update

I am looking for the overlapping pattern definition used in fp101x course.

According to the definition used in fp101x the following function has overlapping patterns:

last :: [a] -> a
last [x] = x
last (_ : xs) = last xs

This is in contradiction with GHC definition of overlapping pattern which does not consider it to have any overlapping patterns.

Without a proper definition of what overlapping pattern means in the fp101x course context, it is impossible to solve that exercise. And the definition used there is not the GHC one.

Răzvan Flavius Panda
  • 21,730
  • 17
  • 111
  • 169
  • 1
    GHC warns about "pattern overlapping" when it actually detects "pattern inclusion" -- precisely when one case in the pattern becomes unreachable since it only covers values already dealt with in previous cases. – chi Dec 28 '14 at 16:27
  • @chi: that would be what GHC consideres to be `pattern overlapping`. The first function definition seems to be considered `pattern overlapping` also. Is there a formal definition of what `pattern overlapping` is? – Răzvan Flavius Panda Dec 28 '14 at 16:31
  • From your description, it seems like they don't want any sort of overlapping. Then you could do something like use the pattern `last _:x:xs = last (x:xs); last [x] = x` or use guards. – Rafael Almeida Dec 29 '14 at 12:12

4 Answers4

2

The updated question clarifies the OP wants a formal definition of overlapping patterns. Here "overlapping" is meant in the sense used by GHC when it emits its warnings: that is, when it detects that a case branch is unreachable because its pattern does not match with anything which is not already handled by earlier branch.

A possible formal definition can indeed follow that intuition. That is, for any pattern p one can first define the set of values (denotations) [[p]] matching with p. (For this, it is important to know the type of the variables involved in p -- [[p]] depends on a type environment Gamma.) Then, one can say that in the sequence of patterns

q0 q1 ... qn p

the pattern p is overlapping iff [[p]], as a set, is included in [[q0]] union ... union [[qn]].

The above definition is hardly operative, though -- it does not immediately lead to an algorithm for checking overlaps. Indeed, computing [[p]] is unfeasible since it is an infinite set, in general.

To define an algorithm, I'd try to define a representation for the set of terms "not yet matched" by any pattern q0 .. qn. As an example, suppose we work with lists of booleans:

Remaining: _   (that is, any list)

q0 = []
Remaining: _:_  (any non empty list)

q1 = (True:xs)
Remaining: False:_

p = (True:False:ys)
Remaining: False:_

Here, the "remaining" set did not change, so the last pattern is overlapping.

As another example:

Remaining: _

q0 = True:[]
Remaining: [] , False:_ , True:_:_

q1 = False:xs
Remaining: [], True:_:_

q2 = True:False:xs
Remaining: [], True:True:_

q3 = []
Remaining: True:True:_

p = True:xs
Remaining: nothing  -- not overlapping (and exhaustive as well!)

As you can see, in each step we match each of the "remaining" samples with the pattern at hand. This generates a new set of remaining samples (possibly none). The collection of all these samples forms the new remaining set.

For this, note that it is important to know the list of constructors for each type. This is because when matching with True, you must know there's another False case remaining. Similarly, if you match against [], there's another _:_ case remaining. Roughly, when matching against constructor K, all other constructors of the same type remain.

The above examples are not yet an algorithm, but they can get you started, hopefully.

All of this of course ignores case guards (which make the overlap undecidable), pattern guards, GADTs (which can further refine the remaining set in quite subtle ways).

chi
  • 111,837
  • 3
  • 133
  • 218
1

I am looking for the overlapping pattern definition used in fp101x course.

"Patterns that do not rely on the order in which they are matched are called disjoint or non-overlapping." (from "Programming in Haskell" Graham Hutton)

So this example would be non-overlapping

foldr :: (a → b → b) → b → [a] → b
foldr v [] = v
foldr f v (x : xs) = f x (foldr f v xs)

Because you can change the order of pattern-matching like this:

foldr :: (a → b → b) → b → [a] → b
foldr f v (x : xs) = f x (foldr f v xs)
foldr v [] = v

And here you can't:

last :: [a] -> a
last [x] = x
last (_ : xs) = last xs

So the last one )) is overlapping.

Andrew
  • 36,676
  • 11
  • 141
  • 113
0

I think the thing is that in the first case, not all matches of [x] will match (_:xs). On the second case, the converse is true (no one matching (_:xs) will fall through [x]). So, overlapping really means that there is an unreachable pattern.

This is what GHC documentation has to say about it:

By default, the compiler will warn you if a set of patterns are either incomplete (i.e., you're only matching on a subset of an algebraic data type's constructors), or overlapping, i.e.,

f :: String -> Int
f []     = 0 
f (_:xs) = 1 
f "2"    = 2

where the last pattern match in `f' won't ever be reached, as the second pattern overlaps it. More often than not, redundant patterns is a programmer mistake/error, so this option is enabled by default.

Maybe "unreachable pattern" would be a better choice of words.

Rafael Almeida
  • 2,377
  • 3
  • 22
  • 32
  • Do you know if there is any definition which can be used to determine exactly what `overlapping pattern` means? I'm pretty sure that both functions (regardless of order) are considered to have `overlapping patterns` but without a clear definition it is imposibile to determine. – Răzvan Flavius Panda Dec 28 '14 at 16:10
  • Will wait for better answers since this needs a clear definition that covers all the cases. – Răzvan Flavius Panda Dec 28 '14 at 16:28
  • Overlapping patterns are a compiler warning. They only come up when there's an unreachable pattern. I don't think there's a precise meaning for this anywhere. I'll add what GHC documentation says to my answer. I hope that further clarifies things. – Rafael Almeida Dec 28 '14 at 16:38
  • The problem is that the GHC meaning of `overlapping pattern` does not match the first function definition even though that is an `overlapping pattern` too. – Răzvan Flavius Panda Dec 28 '14 at 17:01
  • In the first definition last (\_ : xs) = last xs is still reachable (last [1,2], for instance). The warning is about a pattern being completely overlapped by a previous pattern. – Rafael Almeida Dec 28 '14 at 17:05
  • Yes, the problem is not wether it is reachable or not. The problem is that, the first function definition is considered an `overlapping pattern` too. And the GHC definition does not cover that. All matches of [x] are contained in the matches of (_ : xs) and that is overlapping... it is not exclusive. The only thing which makes that exclusive is the pattern matching order. – Răzvan Flavius Panda Dec 28 '14 at 17:08
  • Oh, ghc also warns on the first definition? That's likely a bug. What's your ghc version? Mine is 7.8.3 and it doesn't complain about the first definition. – Rafael Almeida Dec 28 '14 at 17:13
  • From the original question: "What makes this confusing is that although the patterns are (by the definition above) overlapping, GHC does not show any warning about them being overlapping." GHC does not say it is pattern overlapping, but it is. – Răzvan Flavius Panda Dec 28 '14 at 17:17
  • Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/67801/discussion-between-rafael-almeida-and-rzvan-panda). – Rafael Almeida Dec 28 '14 at 17:21
0

I would suggest using reasoning logic in combination with compiler messages and test results would be a better way to understand if a function has overlapping patterns or not. As two examples, the first which has already been listed, indeed results in a compiler warning.

-- The first definition should work as expected.
last1 :: [a] -> a
last1 [x] = x
last1 (_:xs) = last xs

in the second case if we swap the last two lines around then a compiler error which states. Program error: pattern match failure: init1 [] results

last :: [a] -> a
last (_:xs) = last xs
last [x] = x

This matches the logic of passing a singleton list which could match in both patterns, and in this case the now second line.

last (_:xs) = last xs

will match in both cases. If we then move onto the second example

-- The first definition should work as expected
drop :: Int -> [a] -> [a]
drop 0 xs = xs
drop n [] = []
drop n (_:xs) = drop1 (n - 1) xs

In the second case if we again swap the last line with the first line then we don't get a compiler error but we also don't get the results we expect. Main> drop 1 [1,2,3] returns an empty list []

drop :: Int -> [a] -> [a]
drop n (_:xs) = drop1 (n - 1) xs
drop 0 xs = xs
drop n [] = []

In summary I think this is why reasoning (as oppose to a formal definition) for determining overlapping patterns works ok.

Faktor 10
  • 1,868
  • 2
  • 21
  • 29