Problem starts from a algorithm problem from codeforces, here is the link Writing Code. The problem is a basic DP problem which solution could be found at Solution Report.
I try to solve it using Haksell, my solution is recurse with memoization, which time complexity is also O(n^3)
. Here is my code:
{-# OPTIONS_GHC -O2 -funbox-strict-fields #-}
{-# LANGUAGE MultiParamTypeClasses, FlexibleInstances, BangPatterns #-}
import Data.Functor
import Data.List
class Memo m a where
memo :: m -> a
instance Memo m m where
memo = id
instance Memo m a => Memo (Int -> m) [a] where
memo f = map (memo.f) [0..]
solve :: Int -> Int -> Int -> Int -> [Int] -> Int
solve n m b mod arr = modsum.map (go n m) $ [0 .. b]
where mods = (`rem` mod)
modsum = foldl1' (\a b -> mods (a + b))
go :: Int -> Int -> Int -> Int
go _ 0 0 = 1
go _ 0 _ = 0
go 0 _ _ = 0
go i j k = mods $ a + b
where !id = i - 1
!ai = arr !! id
!a = (if k >= ai then memo_go !! i !! (j - 1) !! (k - ai) else 0)
!b = memo_go !! id !! j !! k
memo_go = memo go
main = do [n, m, b, mod] <- map read.words <$> getLine
arr <- map read.words <$> getLine
print $ solve n m b mod arr
And the performance report of this program using RTS
is:
Fri May 22 12:25 2015 Time and Allocation Profiling Report (Final)
test.exe +RTS -p -RTS
total time = 5.75 secs (5751 ticks @ 1000 us, 1 processor)
total alloc = 189,957,112 bytes (excludes profiling overheads)
COST CENTRE MODULE %time %alloc
solve.go Main 97.2 8.5
memo Main 2.3 82.8
solve.mods Main 0.5 8.5
individual inherited
COST CENTRE MODULE no. entries %time %alloc %time %alloc
MAIN MAIN 52 0 0.0 0.0 100.0 100.0
main Main 105 0 0.0 0.3 100.0 100.0
solve.modsum Main 108 1 0.0 0.0 0.0 0.0
solve Main 106 1 0.0 0.0 100.0 99.7
solve.memo_go Main 114 1 0.0 0.0 99.9 99.7
memo Main 115 10301 2.3 82.8 99.9 99.7
solve.go Main 118 1024999 97.2 8.5 97.6 16.9
solve.mods Main 119 0 0.5 8.5 0.5 8.5
solve.go Main 113 101 0.0 0.0 0.0 0.0
solve.mods Main 120 0 0.0 0.0 0.0 0.0
solve.mods Main 111 1 0.0 0.0 0.0 0.0
solve.modsum Main 109 0 0.0 0.0 0.0 0.0
solve.modsum.\ Main 110 100 0.0 0.0 0.0 0.0
solve.mods Main 112 0 0.0 0.0 0.0 0.0
CAF GHC.IO.Encoding.CodePage 91 0 0.0 0.0 0.0 0.0
CAF GHC.IO.Encoding 84 0 0.0 0.0 0.0 0.0
CAF Text.Read.Lex 81 0 0.0 0.0 0.0 0.0
CAF GHC.IO.Handle.FD 72 0 0.0 0.0 0.0 0.0
CAF:main1 Main 68 0 0.0 0.0 0.0 0.0
main Main 104 1 0.0 0.0 0.0 0.0
CAF:lvl2_r3ul Main 64 0 0.0 0.0 0.0 0.0
CAF:$fMemo(->)[]1 Main 61 0 0.0 0.0 0.0 0.0
memo Main 116 0 0.0 0.0 0.0 0.0
CAF:$fMemomm_$cmemo Main 59 0 0.0 0.0 0.0 0.0
memo Main 117 1 0.0 0.0 0.0 0.0
Predictably, the bottleneck is the memo_go
. The memo_go is a 3-D array, in which every element is a thunk go i j k
, and it would only be evaluated once. However, this code got a TLE from the judge.
I wander how to improve the performance of this code?
Updata I change the List to Array and got an MLE, so I use rolling-array to reduce the memory, here is my code:
{-# OPTIONS_GHC -O2 -funbox-strict-fields #-}
{-# LANGUAGE MultiParamTypeClasses, FlexibleInstances, BangPatterns #-}
import Data.Functor
import Data.List (foldl', foldl1')
import Data.Array
--import qualified Data.Vector as V
size = 510
class Memo m a where
memo :: m -> a
instance Memo m m where
memo = id
instance Memo m a => Memo (Int -> m) (Array Int a) where
memo f = fmap (memo.f) $ listArray (0, size) [0..size]
solve :: Int -> Int -> Int -> Int -> [Int] -> Int
solve n m b mod lis = modsum.map (res ! m !) $ [0 .. b]
where arr = listArray (0, n - 1) lis
mods = (`rem` mod)
modsum = foldl1' (\a b -> mods (a + b))
init :: Int -> Int -> Int
init 0 0 = 1
init _ _ = 0
arr_init = memo init
dp n = foldl' build arr_init [1..n]
where build ar idx = memo_go'
where !ai = arr ! (idx - 1)
memo_go' = memo go'
go' i j = mods $ a + b
where !a = ar ! i ! j
!b = if i > 0 && j >= ai then memo_go' ! (i - 1) ! (j - ai) else 0
res = dp n
main = do [n, m, b, mod] <- map read.words <$> getLine
arr <- map read.words <$> getLine
print $ solve n m b mod arr
Still MLE, maybe I should use the mutable data structrue :(