8

I've just started learning Haskell and found a strange thing.

Let we have a list:

ghci> [0,2..5]
[0,2,4]

It has 3 elements. When I use map with this list I get 3 element as output, for example:

ghci> map (+ 1) [0,2..5]
[1,3,5]
ghci> map (* 2) [0,2..5]
[0,4,8]
ghci> map (`div` 2) [0,2..5]
[0,1,2]

But when I use fractional division I get 4 elements in output list:

ghci> map (/ 2) [0,2..5]
[0.0,1.0,2.0,3.0]
ghci> length (map (/ 2) [0,2..5])
4

Could you please explain why map may return more elements then it was?

Thank you!

Vadik Sirekanyan
  • 3,332
  • 1
  • 22
  • 29
  • 1
    Related: http://stackoverflow.com/q/7290438/2541573 – jub0bs Sep 03 '15 at 05:39
  • 1
    Note that `length (map f xs) == length (map f' xs')` for every `length xs == length xs'`. This *must* be true independently on the implementation of `map` since it derives from its type. `map` isn't able to distinguish between different types and hence decided how many elements to return. – Bakuriu Sep 03 '15 at 06:57

1 Answers1

11

It's due to the implementation of Enum for Float and Double:

> [0,2..5] :: [Float]
[0.0,2.0,4.0,6.0]

It's not map doing it, but Float. Specifically, if you call enumFromThenTo 0 2 5 :: [Float], you'll get the same list. You'll see the same results for Double.

This is hinted at in the haskell report, but the behavior is definitely non-obvious. Essentially, it comes down to the implementation of numericEnumFromThenTo (we're getting into some Haskell internals here), which is used by the Enum Float instance:

numericEnumFromThenTo n n' m = takeWhile p (numericEnumFromThen n n')  
    where  
        p | n' >= n   = (<= m + (n' - n) / 2)  
          | otherwise = (>= m + (n' - n) / 2) 

numericEnumFromThen n m = iterate (+ (m - n)) n

So you have numericEnumFromThen 0.0 2.0 generating the list [0.0,2.0,4.0,6.0,8.0,...], then you do takeWhile p on that, which in this case is equivalent to the function \x -> x <= 5.0 + (2.0 - 0.0) / 2, or more simply \x -> x <= 6.0, which is why 6.0 is included in the output list of [0.0,2.0..5.0].

I can't explain why it's implemented this way, that's pretty baffling to me too, but hopefully I've answered the how for its implementation.

bheklilr
  • 53,530
  • 6
  • 107
  • 163
  • 5
    The why is that it's important that `[0,0.1..1]` always have eleven elements, whereas it doesn't really matter how many elements `[0,2..5]` has since it's nonsense in the first place. – Reid Barton Sep 03 '15 at 03:52
  • 2
    To elaborate on @ReidBarton, the implementation of floating point ranges is designed so that the length is insensitive to rounding errors in the common case when the difference between start and end point is *approximately* a whole number of steps. The cutoff is instead put at the unlikely "opposite" case where there's a half step excess. – Ørjan Johansen Sep 03 '15 at 10:59