I'm trying to optimize the execution speed of my program, and I ran into some interesting results that I'm hoping someone can answer. It seems that making small changes in one of my list comprehensions drastically changes the execution speed, but I don't know why.
Here's my program as it is right now.
import Data.Ord
import Control.Monad
import Data.Array
import Data.Ix
import qualified Data.Map as M
import qualified Data.Set as S
import Data.List (minimumBy, foldl')
arrayMatrix lists = let rlen = length lists
clen = length $ head lists
r = ((1,1), (rlen, clen))
in array r . zip (range r) $ concat lists
a_star start goal h m = search S.empty (S.singleton start)
(M.singleton start (m ! start))
$ M.singleton start (m ! start + h ! start)
where neighbors (r,c) = filter (inRange $ bounds m) [ (r-1,c), (r,c+1), (r+1,c) , (r,c-1)]
search closed open gs fs
| S.null open = 0
| current == goal = gs M.! goal
| otherwise = let open' = S.delete current open
closed' = S.insert current closed
neighbs = [(n, ts) | n <- neighbors current, S.notMember n closed
, let ts = gs M.! current + m ! n ]
actionable = filter (\(n,ts) -> S.notMember n open' || ts < (gs M.! n)) neighbs
(op',gs',fs') = foldl' (\(o,ng,nf) (n,ts) -> (S.insert n o, M.insert n ts ng, M.insert n (ts + h ! n) nf)) (open',gs,fs) actionable
in search closed' op' gs' fs'
where current = minimumBy (comparing (fs M.!)) $ S.toList open
main = do
matrix <- liftM (arrayMatrix . map (read . ('[':) . (++"]")) . lines)
$ readFile "matrix.txt"
let bds = bounds matrix
ulim = snd bds
heuristic = let m = minimum $ elems matrix
in listArray bds . map (\(r,c) -> (uncurry (+) ulim)-r-c) $ range bds
print $ a_star (1,1) ulim heuristic matrix
Right now the program runs on my computer ~350ms (compiled with GHC 7.8.2 -O2) with the matrix.txt supplied by Project Euler.
If I change neighbs from
neighbs = [(n, ts) | n <- neighbors current, S.notMember n closed
, let ts = gs M.! current + m ! n ]
to
neighbs = [(n, gs M.! current + m ! n) | n <- neighbors current, S.notMember n closed]
the execution time increases to over 1sec.
Other minor changes like moving the filter on the next line into the list comprehension yields the same result: ~1sec.
Can anyone explain why this happens?
EDIT: It seems this doesn't happen on earlier versions of GHC. I tried GHC 7.6.3 and each of these performed about the same.
I've included the dumps from running ghc -O2 -ddump-simpl -dsuppress-all
as suggested by cdk. I don't really know what I'm looking at, so if anyone is able to interpret, that would be a big help, thanks.
EDIT2 (Response to Priyatham): I don't think that's the case. I changed
neighbs = [(n, ts) | n <- neighbors current, S.notMember n closed
, let ts = gs M.! current + m ! n ]
actionable = filter ((n,ts) -> S.notMember n open' || ts < (gs M.! n)) neighbs
to
neighbs = [(n, gs M.! current + m ! n) | n <- neighbors current, S.notMember n closed ]
actionable = filter ((n,!ts) -> S.notMember n open' || ts < (gs M.! n)) neighbs
using BangPatterns, and that still runs at a little over a second. In fact, modifying neigbs from
neighbs = [(n, ts) | n <- neighbors current, S.notMember n closed
, let ts = gs M.! current + m ! n ]
to
neighbs = [(n, ts) | n <- neighbors current, S.notMember n closed
, let !ts = gs M.! current + m ! n ] -- Added bang before ts
increases the runtime to over 1sec as well.