2
let strings = ["foo", "bar", "baz"]
in  [filter (== char) (concat strings) | char <- "oa"]

Does GHC evaluate concat strings when char == 'o' and then again when char == 'a'? Or does it remember that concat strings == "foobarbaz" for later use?

I realize I could avoid that uncertainty by refactoring this code, but I am interested in how the code compiles.

Jordan
  • 4,510
  • 7
  • 34
  • 42
  • List comprehensions in Haskell are just syntactic sugar for the list monad. So your above code is equivalent to `"oa" >>= (\char -> return $ filter (== char) (concat strings))`. Hence `concat strings` will be evaluated twice. – Aadit M Shah Dec 01 '13 at 06:49
  • `let strings = concat ["foo", "bar", "baz"] in ...` and your problem is solved! – kqr Dec 01 '13 at 08:38
  • 2
    !! It depends on the optimization level in GHC, see my answer below – Chris Kuklewicz Dec 01 '13 at 09:55

3 Answers3

4

GHC may evaluate it far fewer than 9 times. In fact it does so, and we can prove that using Debug.Trace.trace

module Main (main) where 
import Debug.Trace

x = let strings = ["foo", "bar", "baz"]
    in  [filter (== char) (trace "\nconcat strings\n" (concat strings)) | char <- "oaxyz"]

main = do
    print x

Here is evaluates for "oaxyz" 5 times for -O0, and once for -O1 and -O2:

! 529)-> touch LC.hs ; ghc -O0 -o lc0 LC.hs 
[1 of 1] Compiling Main             ( LC.hs, LC.o )
Linking lc0 ...

(! 530)-> touch LC.hs ; ghc -O1 -o lc1 LC.hs 
[1 of 1] Compiling Main             ( LC.hs, LC.o )
Linking lc1 ...

(! 531)-> ./lc0; ./lc1; ./lc2

concat strings


concat strings


concat strings


concat strings


concat strings

["oo","aa","","","z"]

concat strings

["oo","aa","","","z"]

concat strings

["oo","aa","","","z"]
Chris Kuklewicz
  • 8,123
  • 22
  • 33
2

What Chris says in his reply is all true. While you cannot completely rely on the expression to be shared, it's a valid optimization for GHC to share it, and it will usually do so when optimizations are enabled. Nevertheless, you might not want to rely on this functionality, and if you can make the intended sharing explicit by lifting the concat call out of the lambda, I'd so so.

Using Debug.Trace.trace for such purposes is a good way to get an idea about when and how often things are evaluated.

Another option is to look at the generated core code. For this program:

main = print x

x = let strings = ["foo", "bar", "baz"]
    in  [filter (== char) (concat strings) | char <- "oa"]

Let's compile wihout optimisations and look at the resulting code:

$ ghc NrEval -fforce-recomp -ddump-simpl -dsuppress-all
[1 of 1] Compiling Main             ( NrEval.hs, NrEval.o )

==================== Tidy Core ====================
Result size of Tidy Core = {terms: 50, types: 54, coercions: 0}

main
main =
  print
    ($fShow[] ($fShow[] $fShowChar))
    (let {
       a_ssN
       a_ssN = unpackCString# "foo" } in
     let {
       a1_ssQ
       a1_ssQ = unpackCString# "bar" } in
     let {
       a2_ssT
       a2_ssT = unpackCString# "baz" } in
     let {
       a3_ssU
       a3_ssU = : a2_ssT ([]) } in
     let {
       a4_ssV
       a4_ssV = : a1_ssQ a3_ssU } in
     let {
       strings_ahk
       strings_ahk = : a_ssN a4_ssV } in
     letrec {
       ds_dsE
       ds_dsE =
         \ ds1_dsF ->
           case ds1_dsF of _ {
             [] -> [];
             : ds3_dsG ds4_dsH ->
               : (filter
                    (\ ds5_dsI -> == $fEqChar ds5_dsI ds3_dsG) (concat strings_ahk))
                 (ds_dsE ds4_dsH)
           }; } in
     ds_dsE (unpackCString# "oa"))

main
main = runMainIO main

We can see how even without optimisations the list comprehension has been desugared into an (inlined) application of map in ds_dsE. However, concat strings_ahk remains under a lambda (ds1_dsF), which means it will be evaluated every time the function is evaluated, which is twice: once in the call to ds_dsE to the string "oa", and once in the recursive call ds_dsE ds4_dsH.

Now let's compare the results with optimisations:

$ ghc NrEval -fforce-recomp -ddump-simpl -dsuppress-all -O
[1 of 1] Compiling Main             ( NrEval.hs, NrEval.o )

==================== Tidy Core ====================
Result size of Tidy Core = {terms: 89, types: 105, coercions: 9}

main9
main9 = unpackCString# "foo"

main8
main8 = unpackCString# "bar"

main7
main7 = unpackCString# "baz"

main6
main6 = : main7 ([])

main5
main5 = : main8 main6

main_strings
main_strings = : main9 main5

main4
main4 =
  \ ds_dsT ds1_dsS ->
    : (letrec {
         go_ato
         go_ato =
           \ ds2_atp ->
             case ds2_atp of _ {
               [] -> [];
               : y_atu ys_atv ->
                 let {
                   z_XtU
                   z_XtU = go_ato ys_atv } in
                 letrec {
                   go1_XtX
                   go1_XtX =
                     \ ds3_XtZ ->
                       case ds3_XtZ of _ {
                         [] -> z_XtU;
                         : y1_Xu6 ys1_Xu8 ->
                           case y1_Xu6 of wild2_atz { C# c1_atB ->
                           case ds_dsT of _ { C# c2_atF ->
                           case eqChar# c1_atB c2_atF of _ {
                             False -> go1_XtX ys1_Xu8;
                             True -> : wild2_atz (go1_XtX ys1_Xu8)
                           }
                           }
                           }
                       }; } in
                  go1_XtX y_atu
              }; } in
       go_ato main_strings)
      ds1_dsS

main3
main3 = unpackFoldrCString# "oa" main4 ([])

main2
main2 = showList__ $fShowChar_$cshowList main3 ([])

main1
main1 = \ eta_B1 -> hPutStr2 stdout main2 True eta_B1

main
main = main1 `cast` ...

main10
main10 = \ eta_Xp -> runMainIO1 (main1 `cast` ...) eta_Xp

main
main = main10 `cast` ...

Here, we can see that a lot has happened, but in particular, the call to concat strings has been lifted to the top-level and unfolded completely at compile time, resulting in main_strings pointing to the concatenated three-element list of strings. This is clearly shared when go_ato is invoked with main_strings.

kosmikus
  • 19,549
  • 3
  • 51
  • 66
1

It will be evaluated twice. GHC doesn't memoize. list comprehensions desugar like this

 [term | x <- xs, y <- ys ...] -- I ignore guards

 do
  x <- xs
  y <- ys
  ...
  return term

Which is the same as

 flip concatMap xs $ \x -> flip concatMap ys $ \y -> ... [term]

So it's clear that term is going to be computed l1 * l2 * l3 ... ln times where li is the length of the ith list.

Community
  • 1
  • 1
daniel gratzer
  • 52,833
  • 11
  • 94
  • 134