4

I'm following the Stanford Algorithms MOOC and trying to implement the problems using Haskell. Many of the algorithms require quite a bit of data juggling, and pure solutions are running way slower than benchmarks people are quoting for imperative languages. So I feel i need to use mutable data structures.

Most Haskell data structures seem to have mutable equivalents, but with lots of warnings about using them. Implementing these through an IO Monad seems to be readily possible, but I get the impression that the 'purer' way is to use an ST Monad, but in the latter case I only see Arrays referred to, where runSTArray is available (see code below). Am I missing something?

main = do
    let
        inputData = [1,10,4,3,2]
    print $ elems $ runSTArray $ do
        --newListArray :: (MArray a e m, Ix i) => (i, i) -> [e] -> m (a i e)
        state <- newListArray (1, length inputData) inputData
        qsort state 1 (length inputData)
        return state

qsort :: (STArray s Int Int) -> Int -> Int -> ST s ()
qsort arr min mx =
    if mx - min < 1 then
        return ()

    else do
        p <- readArray arr min
        final_i <- foldM (partitioner p) (min+1) [(min+1)..mx]
        swap min $ final_i - 1
        qsort arr min     (final_i-2)
        qsort arr final_i mx     

    where
        swap i j = do
            arr_i <- readArray arr i
            arr_j <- readArray arr j
            writeArray arr i arr_j
            writeArray arr j arr_i

        partitioner p i idx = do
            arr_idx <- readArray arr idx
            if arr_idx > p then
                return i
            else do
                swap i idx
                return $ i+1

Concretely, what is the effective difference between the above and:

main = do
    [snip]
    arr <- newListArray (0, length inputData - 1) inputData :: IO (IOArray Int Int)
    qsort arr 0 (length inputData - 1)
    printArray arr

qsort :: (IOArray Int Int) -> Int -> Int -> IO ()
[As above]

Thinking some more: is this question no more than the difference between the ST and IO Monad, where the lay answer is that ST is 'safer'?

Simon H
  • 20,332
  • 14
  • 71
  • 128
  • 1
    What data structure do you want to use? A vector? `Data.Vector.Unboxed.Mutable` works in `ST` and `IO`. A map? `Data.Map` is pretty fast, and if you need mutable values _in_ the map, you can simply use `STRef`. A map from integral values? `Data.IntMap` is even faster. You should add a specific problem to your question, otherwise it might be too broad. – Zeta Oct 26 '14 at 08:49
  • "Most Haskell data structures seem to have mutable equivalents". They do? – Tom Ellis Oct 26 '14 at 11:11
  • Its also worth considering that often the jump from List to Vector isn't justified and you could get away with a Sequence for example, which is both high performant and pure. Also, use Vector instead of Array. – alternative Oct 26 '14 at 14:12
  • @alternative I'll look into that. Meanwhile, I now see how to use ST with Vectors: `print $ elems $ runST $ do [...]; frozen <- freeze state; return frozen`. All the examples I found with data structures use `runSTArray`, which rather tricked me – Simon H Oct 26 '14 at 14:28
  • @SimonH I think that Vector more or less answers your question too. If you look at the types, it doesn't run in ST or IO, it runs in the PrimMonad typeclass, which has instances for both ST and IO. So you can test the same thing in both with the same code, and you'll probably see that they are equally performant. – alternative Oct 26 '14 at 14:34

1 Answers1

2

Yes you are missing the module Data.STRef. You can create mutable variables using newSTRef, store variables using writeSTRef and fetch with readSTRef.

ArunasR
  • 1,907
  • 1
  • 14
  • 15