
First of all, my apologies for the non-descriptive title. Since I have no idea what's actually going on I can't really make it any more specific.

Now for my question. I have implemented the following snippet for problem 23 of the 99 Haskell problems, which should randomly select n items from a list:

import System.Random (RandomGen, getStdRandom, randomR)

rndSelect' :: RandomGen g => [a] -> Int -> g -> ([a], g)
rndSelect' _ 0 gen = ([], gen)
rndSelect' [] _ _ = error "Number of items requested is larger than list"
rndSelect' xs n gen = ((xs !! i) : rest, gen'')
                    where (i, gen') = randomR (0, length xs - 1) gen
                          (rest, gen'') = (rndSelect' (removeAt xs i) (n - 1) gen')

rndSelectIO' :: [a] -> Int -> IO [a]
rndSelectIO' xs n = getStdRandom $ rndSelect' xs n

removeAt :: [a] -> Int -> [a]
removeAt xs n
  | length xs <= n || n < 0 = error "Index out of bounds"
  | otherwise = let (ys, zs) = splitAt n xs
                    in ys ++ (tail zs)

When I load this in GHCi, it works correctly for valid arguments:

*Main> rndSelectIO' "asdf" 2 >>= putStrLn 
af

However, strange things happen when I request more items than the list contains:

*Main> rndSelectIO' "asdf" 5 >>= putStrLn
dfas*** Exception: Number of items requested is larger than list
*Main> rndSelectIO' "asdf" 2 >>= putStrLn
*** Exception: Number of items requested is larger than list

As you can see, two things happen that I did not expect:

  1. Instead of giving an error right away, it first prints a permutation of the input.
  2. After it has given an error once, it won't execute at all anymore.

I suspect that 1. has to do with lazy evaluation, but I have absolutely no clue why 2. happens. What's going on here?

mb14
Tiddo
  • Each line in the interactive session is inside the same implicit `do` statement, which never ends, so every line passes its result to the next. – chepner Jun 12 '15 at 13:42

1 Answer


The getStdRandom function basically looks up a StdGen value in a global variable, runs some function on it, puts the new seed back into the global variable, and returns the result to the caller.

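If the function in question returns with an error, that error gets put into the global variable: thanks to lazy evaluation, the new generator is stored as an unevaluated thunk, and the error only fires when something forces it. Now all attempts to use this global variable will throw an exception. (I told you global variables are evil! ;-))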
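To make the mechanism concrete, here is a minimal model that uses only `base`. It is a sketch, not the real implementation: `modifyGlobal` and `badStep` are hypothetical names, the "generator" is just an `Int`, and the assumption is that `getStdRandom` behaves like a lazy `atomicModifyIORef` on a global `IORef`, which is what the behaviour in the question suggests.

```haskell
import Control.Exception (ErrorCall, evaluate, try)
import Data.IORef (IORef, atomicModifyIORef, newIORef, readIORef, writeIORef)
import Data.Tuple (swap)

-- Hypothetical stand-in for getStdRandom: run a step function on the
-- stored state, write the new state back, and return the result.
modifyGlobal :: IORef Int -> (Int -> (a, Int)) -> IO a
modifyGlobal ref f = atomicModifyIORef ref (swap . f)

-- A step whose result is fine but whose new state is an error,
-- much like rndSelect' running out of list elements.
badStep :: Int -> (String, Int)
badStep _ = ("partial result", error "boom")

main :: IO ()
main = do
  ref <- newIORef 0
  out <- modifyGlobal ref badStep  -- succeeds: only the result is demanded
  putStrLn out
  -- Forcing the stored state now raises the error that was written back.
  r <- try (readIORef ref >>= evaluate) :: IO (Either ErrorCall Int)
  putStrLn (either (const "state is poisoned") show r)
  writeIORef ref 0                 -- analogous to resetting with setStdGen
  readIORef ref >>= print
```

The first call prints its result without complaint, because nothing has looked at the new state yet. Only the later read trips over the stored error, which matches the second symptom in the question.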

Try calling getStdGen manually yourself. It will either print out the current random seed, or throw an exception. If it throws an exception... there's your problem.

I believe you can use setStdGen to reset the thing.
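A minimal sketch of that reset, assuming the `random` package is available. The seed 2015 is arbitrary and chosen only for illustration:

```haskell
import System.Random (getStdGen, mkStdGen, setStdGen)

main :: IO ()
main = do
  -- Overwrite the global generator with a fresh one built from a fixed seed.
  setStdGen (mkStdGen 2015)
  -- Reading it back should now succeed instead of throwing.
  gen <- getStdGen
  print gen
```

In GHCi the same effect should come from typing `setStdGen (mkStdGen 2015)` at the prompt. Note that a fixed seed makes the subsequent random sequence reproducible, which may or may not be what you want.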

MathematicalOrchid
  • So it's a laziness problem right? `getStdRandom` should not be able to put back an error in `StdGen`. – mariop Jun 12 '15 at 11:50
  • @mariop Arguably allowing a single global variable for this is a hack in the first place (e.g., thread safety could be threatened). But yeah, I guess you could argue it should do some hoopy thing to avoid reinserting an exception into the global variable... – MathematicalOrchid Jun 12 '15 at 11:55
  • How is this dealt with in real-world applications? Does everyone initialize their own random generator all the time? And is this a deliberate design decision, or was it an oversight not to catch errors in `getStdRandom`? – Tiddo Jun 12 '15 at 12:02
  • You don't generally ever use `error` in real world applications. You would return a `Maybe` or `Either` value instead. In real applications you also usually use some other randomness source than `StdGen`. – shang Jun 12 '15 at 12:26
  • Ideally yes, you manage your own random seed rather than relying on a hidden global variable shared between all code in the entire program. (And `System.Random` generally isn't very performant; there are other libraries for that, implementing Mersenne-twister or MWC or whatever.) – MathematicalOrchid Jun 12 '15 at 13:04
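The two suggestions in the comments above (return a `Maybe` instead of calling `error`, and pass your own generator around) could be combined roughly as follows. This is a sketch, not the asker's code: `rndSelectMaybe` is a hypothetical name, and the seed 42 is arbitrary.

```haskell
import System.Random (RandomGen, mkStdGen, randomR)

-- Hypothetical total variant of rndSelect': the bounds are checked up front,
-- so the returned generator can never be an error.
rndSelectMaybe :: RandomGen g => [a] -> Int -> g -> Maybe ([a], g)
rndSelectMaybe xs n gen
  | n < 0 || n > length xs = Nothing
  | otherwise              = Just (go xs n gen)
  where
    go _  0 g = ([], g)
    go ys k g =
      case splitAt i ys of
        (before, y : after) ->
          let (picked, g'') = go (before ++ after) (k - 1) g'
          in  (y : picked, g'')
        _ -> ([], g')  -- unreachable: i is always a valid index here
      where
        (i, g') = randomR (0, length ys - 1) g

main :: IO ()
main = do
  let gen = mkStdGen 42  -- the generator is threaded explicitly, no global state
  print (fmap fst (rndSelectMaybe "asdf" 2 gen))
  print (fmap fst (rndSelectMaybe "asdf" 5 gen))  -- prints Nothing
```

Because the failure case is decided before any random number is drawn, an invalid request cannot leave a broken generator behind, whether the generator is local or global.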