How is this memoized DP table too slow for SPOJ?

Question

SPOILERS: I'm working on http://www.spoj.pl/problems/KNAPSACK/ so don't peek if you don't want a possible solution spoiled for you.

The boilerplate:

import Data.Sequence (index, fromList)
import Data.MemoCombinators (memo2, integral)

main = interact knapsackStr

knapsackStr :: String -> String
knapsackStr str = show $ knapsack items capacity numItems
  where [capacity, numItems] = map read . words $ head ls
        ls = lines str
        items = map (makeItem . words) $ take numItems $ tail ls

Some types and helpers to set the stage:

type Item = (Weight, Value)
type Weight = Int
type Value = Int

weight :: Item -> Weight
weight = fst

value :: Item -> Value
value = snd

makeItem :: [String] -> Item
makeItem [w, v] = (read w, read v)

And the primary function:

knapsack :: [Item] -> Weight -> Int -> Value
knapsack itemsList = go
  where go = memo2 integral integral knapsack'
        items = fromList $ (0,0):itemsList
        knapsack' 0 _ = 0
        knapsack' _ 0 = 0
        knapsack' w i | wi > w    = exclude
                      | otherwise = max exclude include
          where wi = weight item
                vi = value item
                item = items `index` i
                exclude = go w (i-1)
                include = go (w-wi) (i-1) + vi

And this code works; I've tried plugging in the SPOJ sample test case and it produces the correct result. But when I submit this solution to SPOJ (instead of importing Luke Palmer's MemoCombinators, I simply copy and paste the necessary parts into the submitted source), it exceeds the time limit. =/

I don't understand why; I asked earlier about an efficient way to perform 0-1 knapsack, and I'm fairly convinced that this is about as fast as it gets: a memoized function that will only recursively calculate the sub-entries that it absolutely needs in order to produce the correct result. Did I mess up the memoization somehow? Is there a slow point in this code that I am missing? Is SPOJ just biased against Haskell?

I even put {-# OPTIONS_GHC -O2 #-} at the top of the submission, but alas, it didn't help. I have tried a similar solution that uses a 2D array of Sequences, but it was also rejected as too slow.

Have you tried profiling? In general, I haven't found that memoisation is that big a help for most tasks. Also, using a sequence doesn't seem to be much help here, maybe a `Map` instead? — ivanm, Sep 22 '11 at 03:41
Can you show us the full code with the necessary bits of memotries provided? That would make it easier to diagnose the problem. — Max Bolingbroke, Sep 22 '11 at 05:20
The `index` operation is `O(log(min(i,n-i)))` (according to haddock). Perhaps you should use an Array or Vector. — Thomas M. DuBuisson, Sep 22 '11 at 06:00
[hpaste of my precise SPOJ 0-1 knapsack submission](http://hpaste.org/51675) — Dan Burton, Sep 22 '11 at 18:39
It's not spoj, it's haskell where you can't reuse what you have. Immutability is cool, but not for heavy dp. Row by row computation with mutable arrays should work but sure why haskell then. [Also 2s even to create 2000x2000 array](http://ideone.com/IkiK3), and spoj machines are much slower. — gorlum0, Sep 22 '11 at 19:02
@gorlum0 this algorithm doesn't require mutation, though. It's simply a DP table where one cell depends on values from other cells. It's not like a Bellman-Ford algorithm which is iteratively accessing and mutating a table. I ran my Haskell, Java, and Scala code (I solved it in all 3 langs, only the Java solution was accepted as within the time limit) with the given example input on ideone.com, and it said the Haskell took 0s, while Java took 0.07s and Scala took 0.19s. — Dan Burton, Sep 22 '11 at 23:00
IIRC hlint prefers `{-# LANGUAGE -O2 #-}` over `{-# OPTIONS_GHC -O2 #-}` so you could try using that instead. — Tyler, Sep 23 '11 at 00:38
@DanBurton: Sure it doesn't but without it you need all these thunks or huge dp array. And about time - [mine](http://ideone.com/ZU3ec) and [yours](http://ideone.com/ciBMS) on max-sized custom input (rewrote what already had in haskell). — gorlum0, Sep 24 '11 at 20:10
If performance matters, don't use `read` to convert String to Int(eger). It's very slow. — Jogusa, Sep 26 '11 at 17:22
@Jogusa: Indeed, but not for this one I believe - only 4002 input values max. — gorlum0, Sep 27 '11 at 16:22
In my experience memotries is slower than `Data.HashMap`, `Data.Map`, and `Data.IntMap`. — tibbe, Nov 10 '11 at 21:26

score 4 · Accepted Answer · answered Oct 18 '11 at 14:02

There's one major problem which really slows this down. It's too polymorphic. Type-specialized versions of functions can be much faster than polymorphic varieties, and for whatever reason GHC isn't inlining this code to the point where it can determine the exact types in use. When I change the definition of integral to:

integral :: Memo Int
integral = wrap id id bits

I get an approximately 5-fold speedup; I think it's fast enough to be accepted on SPOJ.

This is still significantly slower than gorlum0's solution however. I suspect the reason is because he's using arrays and you use a custom trie type. Using a trie will take much more memory and also make lookups slower due to extra indirections, cache misses, etc. You might be able to make up a lot of the difference if you strictify and unbox fields in IntMap, but I'm not sure that's possible. Trying to strictify fields in BitTrie creates runtime crashes for me.

Pure haskell memoizing code can be good, but I don't think it's as fast as doing unsafe things (at least under the hood). You might apply Lennart Augustsson's technique to see if it fares better at memoization.

score 0 · Answer 2 · answered Oct 22 '16 at 17:43

0

The one thing that slows down Haskell is IO, The String type in Haskell gives UTF8 support which we don't need for SPOJ. ByteStrings are blazing fast so you might want to consider using them instead.

answered Oct 22 '16 at 17:43

Vasantha Ganesh

4,570
3
25
33

How is this memoized DP table too slow for SPOJ?

2 Answers2