3

So, I was looking for mutable hashtables (for speed reasons), and the updated answer from Don Stewart led me to try out hashtables.

Being a bit inexperienced in the ST Monad, I successfully expanded the given example to:

{-# LANGUAGE OverloadedStrings #-}
{-# LANGUAGE ScopedTypeVariables #-}
{-# LANGUAGE FlexibleContexts #-}

import qualified Data.HashTable.ST.Cuckoo as C
import qualified Data.HashTable.Class as H
import Control.Monad.ST.Safe
import Data.Text

type HashTable s k v = C.HashTable s k v

foo :: ST s (HashTable s Text Int)
foo = do
  ht <- H.newSized 1
  H.insert ht "abc" 1
  H.insert ht "dea" 3
  return ht

add1 :: HashTable s Text Int -> ST s (HashTable s Text Int)
add1 ht = do
  H.insert ht "abc" 2
  H.insert ht "dea" 4
  return ht

main = do
  let x = runST $ H.toList =<< foo
  print x
  let y = runST $ H.toList =<< add1 =<< foo
  print y

From my (limited) understanding it allows to operate on data structures in a mutable way, but then 'freeze them' and present the result in such a way that it can be escaped from, through runST - and thus we can use the let bindings, and not <-.

However, I was failing to see how I could operate on an hashtable without always converting it to/from lists. The following code does not even compile:

   -- does not compile
   let z = runST foo
   let w = runST $ add1 z
   print w

It gives the following error (among others): hashtable.hs:31:19:

Couldn't match type `a' with `C.HashTable s Text Int'
  `a' is a rigid type variable bound by
      the inferred type of z :: a at hashtable01.hs:31:7
Expected type: ST s a
  Actual type: ST s (HashTable s Text Int)
In the second argument of `($)', namely `foo'
In the expression: runST $ foo
In an equation for `z': z = runST $ foo

This is probably due to the s constrain in the type, and my guess is that it's probably there for precisely that reason. If we would later use z it would not have the same value, since the add1 operated in place, and thus it would not be pure. Is this correct?

But then, the ST Monad in this particular case is only useful for a situation where:

  • You give a fixed input
  • You calculate the new value based on it, using mutable data structures
  • You freeze the result, making it imutable again.

Any further change requires a toList/fromList that efectively copies the values and keeps the original imutable. As I am writing this, I'm thinking - duh, this is the definition of a pure function, as used everywhere in haskell, and thus the use case for the ST Monad (have I reached ST enlightenment?)

So, i guess the real question in this case is: isn't this toList/fromList operation expensive (2xO(n)), and couldn't it there be another way to operate on a copy without the toList/fromList pair of functions. Or am I overcomplicating, and I should just use the IO version of Hashtables?

Community
  • 1
  • 1
jcristovao
  • 554
  • 2
  • 12
  • Please include all imports – Ankur Jul 19 '13 at 12:05
  • Ah, sorry about that... I've included the missing imports. – jcristovao Jul 19 '13 at 12:10
  • I don't think `runST $ foo >>= add1` marshal toList/fromList. Basically you can compose all the ST moands and no marshalling will happen when you execute the composed ST monad – Ankur Jul 19 '13 at 12:16
  • 2
    Yes, it's expensive, and the usual way of dealing with this is to pull everything else into the ST monad until you have a function which uses a hash table as an _internal_ implementation detail and doesn't expose that to the rest of the code. – Louis Wasserman Jul 19 '13 at 16:43
  • 1
    AFAIK there's no `STArray`-like freezing for hashtables. – n. m. could be an AI Jul 19 '13 at 19:47
  • Louis and n.m., that was precisely my question. If either of you would make it an answer, I will mark it as correct. Thanks! – jcristovao Jul 20 '13 at 08:28

1 Answers1

0

You are correct that you can't operate on a hashtable once it leaves the ST monad. The answer is not to do toList/fromList marshaling, but instead to just have a single runST that scopes over everything you need to do with mutating that table.

I.e. as Louis wrote in the comments: "pull everything else into the ST monad until you have a function which uses a hash table as an internal implementation detail and doesn't expose that to the rest of the code".

sclv
  • 38,665
  • 7
  • 99
  • 204