
I evaluated the list [0.1..1] in Haskell and I do not understand why the result is [0.1,1.1]. Can anyone explain this, please?

Rodrigo de Azevedo
renW

1 Answer


TL;DR: don't use [x..y] on floating point numbers. You might get unexpected results.

On floating point numbers, there's no sane semantics for [x..y]. For instance, one might argue that the semantics should be [x,x+1,x+2,...,x+n] where x+n is the largest value of that form which is <=y. However, this does not account for floating point rounding errors. It is possible that x+n produces a slightly larger value than the exact y, making the list shorter than expected. Hence this semantics makes the value of length [x..y] rather unpredictable.
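The rounding problem described above is easy to reproduce. The sketch below uses a hypothetical helper, `naiveFromThenTo` (not a library function), which stops strictly at the upper bound and assumes a positive step. It uses a step of 0.1 rather than 1 so that the rounding error shows up immediately:

```haskell
-- Hypothetical "strict" enumeration: keep elements while they are <= the bound.
-- This is NOT what Haskell's [x, y .. z] does; it only illustrates the problem.
-- Assumes a positive step (y > x).
naiveFromThenTo :: Double -> Double -> Double -> [Double]
naiveFromThenTo x y z = takeWhile (<= z) (iterate (+ (y - x)) x)

main :: IO ()
main = do
  print (0.1 + 0.2 > (0.3 :: Double))             -- True
  print (naiveFromThenTo 0.1 0.2 0.3)             -- [0.1,0.2]
  print (length ([0.1, 0.2 .. 0.3] :: [Double]))  -- 3
```

The strict version silently drops the last element because 0.1 + 0.2 lands just above 0.3, while the built-in enumeration keeps it.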

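Haskell tries to mitigate this issue by allowing an error of up to 0.5: the enumeration continues as long as the next element is <= y+0.5. The rationale is as follows: when x+n is closer to y than to y+1, it should be regarded as some value in the interval [x..y] which got rounded to something larger. This is why [0.1..1] is [0.1,1.1]: the second element, 1.1, is within 0.5 of the upper bound 1, so it is kept, while the next one, 2.1, is not. The choice is arguable, but this is how Haskell works.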
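As a minimal sketch, the rule can be written out in the spirit of the Haskell Report's `numericEnumFrom` and `numericEnumFromTo`. The definitions here are simplified for illustration, and GHC's actual implementation may compute the elements differently:

```haskell
-- Simplified after the Haskell Report's definitions for fractional types.
numericEnumFrom :: Fractional a => a -> [a]
numericEnumFrom = iterate (+ 1)

-- Keep elements up to the upper bound plus a tolerance of 1/2.
numericEnumFromTo :: (Ord a, Fractional a) => a -> a -> [a]
numericEnumFromTo x y = takeWhile (<= y + 1 / 2) (numericEnumFrom x)

main :: IO ()
main = do
  print (numericEnumFromTo 0.1 (1 :: Double))  -- [0.1,1.1]
  print ([0.1 .. 1] :: [Double])               -- [0.1,1.1]
```

The cutoff is 1 + 0.5 = 1.5, so 1.1 passes the test and 2.1 does not.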

In enumerations like [x,y .. z] with an explicit stepping (e.g. [0.0,5.0 .. 1000.0]) Haskell instead allows an error of (y-x)/2 (2.5 in the example). The rationale is the same: we include those points which are closer to 1000 than to 1000+5.
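The stepped form can be sketched the same way. The following is a simplified version modelled on the Report's `numericEnumFromThenTo`, written for illustration rather than copied from any implementation:

```haskell
-- Simplified after the Haskell Report's stepped enumeration.
numericEnumFromThen :: Fractional a => a -> a -> [a]
numericEnumFromThen x y = iterate (+ (y - x)) x

-- The tolerance is half the step, applied in the direction of the step.
numericEnumFromThenTo :: (Ord a, Fractional a) => a -> a -> a -> [a]
numericEnumFromThenTo x y z = takeWhile inRange (numericEnumFromThen x y)
  where
    limit = z + (y - x) / 2
    inRange
      | y >= x    = (<= limit)
      | otherwise = (>= limit)

main :: IO ()
main = do
  -- The cutoff is 13 + 2.5 = 15.5, so 15 is included although it exceeds 13.
  print (numericEnumFromThenTo 0 5 (13 :: Double))              -- [0.0,5.0,10.0,15.0]
  -- With an explicit step of 0.1 the list from the question has ten elements.
  print (length (numericEnumFromThenTo 0.1 0.2 (1 :: Double)))  -- 10
```

For the list in the question, writing the step explicitly, as in [0.1,0.2 .. 1], is the usual way to get ten elements instead of two.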

You can find all the gory details in the Haskell Report which defines the semantics of Haskell. This part is also relevant.

This is generally seen by Haskellers as a small wart in the language. Some argue that we should not have mandated Enum Float and Enum Double. Removing those instances would effectively prohibit the troublesome cases like [1.0 .. 5.0] or the much worse [1.0 .. 5.5] (which is again numerically unstable).

chi
  • Yes. The thing that _would_ actually make sense for `Float` is some `[x ..{n}.. y]` syntax as shorthand for `[x + k*(y-x)/n | k<-[0..n-1]]`, equivalent to `linspace` which is ubiquitous in the Matlab / NumPy etc. communities. Alternatively, `[x..y]` should denote the interval of _all_ numbers between `x` and `y`, but this is utterly impractical if the result has to be a list. – leftaroundabout Oct 06 '21 at 12:49
  • ...although, thinking of it, we could enumerate all the rationals in the interval... what could possibly go wrong... or `[x..y] = let m = (x+y)/2 in m : concat (zipWith (\x y->[x,y]) [x..m] [m..y])` – leftaroundabout Oct 06 '21 at 12:59
  • @leftaroundabout There are indeed several variants, each with its own ups and downs. I wonder if we really need to choose just one as "the" definition of `[x..y]` or simply leave `[x..y]` undefined and offer all the variants as library functions. – chi Oct 06 '21 at 13:12
  • @leftaroundabout wouldn't `[x] ++ [x*(n-k)/n+y*k/n | k <-[1..n-1]] ++ [y]` be better? (for the `n > 0` case) – Will Ness Oct 06 '21 at 13:13
  • @WillNess or `[x + k*(y-x)/(n-1) | k<-[0..n-2]] ++ [y]`, which is actually what I meant (and is the default behaviour of `linspace`). – leftaroundabout Oct 06 '21 at 13:17
  • @leftaroundabout that does make more sense. :) – Will Ness Oct 06 '21 at 13:22
  • Is float always a requirement? `[0.1 :: Rational, 0.2 .. 1]` seems reasonable. – pedrofurla Oct 06 '21 at 14:45
  • @pedrofurla I don't see any issues with `Rational` or any other numeric type having infinite precision (no rounding errors). – chi Oct 06 '21 at 17:07
  • @chi Why do you have an issue with `[0.1 .. 1] :: [Double]` but not `[0.1 .. 1] :: [Rational]`? Their outputs are essentially identical (`[0.1, 1.1]` vs `[1%10, 11%10]`). – Daniel Wagner Oct 06 '21 at 19:24
  • @DanielWagner Good point, I did not realize that rationals worked that way too. Yuck. I now have an issue with both. Rationals are slightly better since `[0.5 .. 10.0]` is predictable, but that's still bad, IMO. – chi Oct 06 '21 at 20:59
  • uh? Here I got `[1 % 10,1 % 5,3 % 10,2 % 5,1 % 2,3 % 5,7 % 10,4 % 5,9 % 10,1 % 1]`. Oh, my example had the second element defined. – pedrofurla Oct 07 '21 at 01:36
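
The `linspace`-style variant discussed in the comments could look roughly like this. The name `linspace` and its argument order are assumptions borrowed from NumPy; no such function exists in `base`. The sketch uses the endpoint-inclusive formula from the comment thread:

```haskell
-- Hypothetical helper, named after NumPy's linspace; not part of base.
-- Returns n evenly spaced points from x to y, with both endpoints included.
linspace :: Fractional a => Int -> a -> a -> [a]
linspace n x y
  | n <= 0    = []
  | n == 1    = [x]
  | otherwise = [x + fromIntegral k * step | k <- [0 .. n - 2]] ++ [y]
  where
    step = (y - x) / fromIntegral (n - 1)

main :: IO ()
main = do
  print (length (linspace 10 0.1 (1 :: Double)))  -- 10
  print (last (linspace 10 0.1 (1 :: Double)))    -- 1.0
  print (linspace 10 0.1 (1 :: Rational))
```

Because the number of points is given explicitly and the last element is appended as `y` itself, neither the length nor the final value depends on rounding.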