8

I am rather new to Haskell and after reading this and some performance tips on strictness I am still wondering how this applies to let and where expressions. If I have code like:

f :: Int -> Int -> Int
f a b
  | a==b      = <simple computation>
  | otherwise = e1 + 2 * e1 - e1^2
  where e1 = <lengthy computation>

how often will <lengthy computation> be evaluated? I assume that, given Haskell's lazy evaluation, e1 will not be evaluated at all if a==b. But if not, is e1 substituted into the otherwise expression and then evaluated every time it is encountered, or is it evaluated when it is first encountered and then stored and reused in all subsequent occurrences? Also:

  • is there a way to control this process "manually"?
  • does this depend on whether I run the code in GHCi or compile it with GHC, and within GHC compilation does it depend on flags like -O?

This is quite similar to this question but I could not find answers for Haskell.

Explanations are very appreciated.

Chris

2 Answers

10

As a rule, code in the where or let block of a constant applicative form is evaluated only once, and only as deep as necessary (i.e., if it's not used at all it also won't be evaluated at all).
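To make the "evaluated only once" part of that rule concrete, here is a sketch using `Debug.Trace.trace` (the concrete numbers and the trace message are just for illustration):

```haskell
import Debug.Trace (trace)

-- A constant applicative form: no arguments, so the where-bound
-- value is shared across the whole program and computed at most once.
answer :: Int
answer = big + big
  where big = trace "computing big" (sum [1 .. 1000])

main :: IO ()
main = do
  print answer  -- "computing big" appears once (on stderr)
  print answer  -- the cached value is reused: no second trace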

f is not a constant applicative form because it has arguments; it's equivalent to

f' = \a b -> let e1 = <lengthy computation>
             in if a==b
                 then <simple computation>
                 else e1 + 2 * e1 - e1^2

So, e1 is evaluated once every time you call the function with both arguments. This is likely also what you want, and in fact the best behaviour possible if <lengthy computation> depends on both a and b. If it, say, only depends on a, you can do better:

f₂ a = \b -> 
  if a==b then <simple computation>
           else e1 + 2 * e1 - e1^2
 where e1 = <lengthy computation>

This form will be more efficient when you do e.g. map (f₂ 34) [1,3,9,2,9]: in that example, e1 is computed only once for the entire list. (But <lengthy computation> won't have b in scope, so it can't depend on it.)
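A sketch of that sharing across a partial application (compiled without optimisation; the stand-in computation depends only on a):

```haskell
import Debug.Trace (trace)

f2 :: Int -> Int -> Int
f2 a = \b ->
    if a == b then 0                           -- stands in for <simple computation>
              else e1 + 2 * e1 - e1 ^ 2
  where e1 = trace "evaluating e1" (a * 10)    -- depends only on a

main :: IO ()
main =
  let g = f2 34                 -- one partial application, one shared e1 thunk
  in  print (map g [1, 3, 9, 2, 9])  -- the trace fires once for the whole list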

OTOH, there can also be scenarios where you don't want e1 to be kept at all (e.g. if it occupies a lot of memory but is rather quick to recompute). In this case, you can just make it a “nullary function”:

f₃ a b
  | a==b = <simple computation>
  | otherwise = e1() + 2 * e1() - e1()^2
 where e1 () = <lengthy computation>

Functions are not memoized by default, so in the above, <lengthy computation> is done zero times if a==b and three times otherwise.
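The re-evaluation can be observed with the same trace trick (a sketch; note that, as discussed in the comments below, GHC with optimisation may float the computation out and share it anyway, so this behaviour is only reliable without optimisation):

```haskell
import Debug.Trace (trace)

f3 :: Int -> Int -> Int
f3 a b
  | a == b    = 0                              -- stands in for <simple computation>
  | otherwise = e1 () + 2 * e1 () - e1 () ^ 2
  where e1 () = trace "evaluating e1" (a * b)  -- a function, so no sharing

main :: IO ()
main = do
  print (f3 3 3)  -- no trace at all
  print (f3 2 5)  -- unoptimised, the trace fires three times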

Yet another possibility is to force that e1 is always evaluated exactly once. You can do that with seq:

f₄ a b = e1 `seq` if a==b
          then <simple computation>
          else e1 + 2 * e1 - e1^2
 where e1 = <lengthy computation>

This is the only one of these suggestions that actually changes the semantics, not just the performance: suppose we always define e1 = error "too tough". Then f, f', f₂ and f₃ will all still work provided that a==b; f₄, however, will fail even in that case.
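That semantic difference can be demonstrated by catching the exception (a sketch; the Int instantiations are illustrative):

```haskell
import Control.Exception (SomeException, evaluate, try)

f, f4 :: Int -> Int -> Int
-- lazy version: e1 is only forced when the otherwise branch needs it
f a b
  | a == b    = 0
  | otherwise = e1 + 2 * e1 - e1 ^ 2
  where e1 = error "too tough"

-- strict version: seq forces e1 before the guard is even consulted
f4 a b = e1 `seq` (if a == b then 0 else e1 + 2 * e1 - e1 ^ 2)
  where e1 = error "too tough"

main :: IO ()
main = do
  r  <- try (evaluate (f  1 1)) :: IO (Either SomeException Int)
  r4 <- try (evaluate (f4 1 1)) :: IO (Either SomeException Int)
  print (either (const "exception") show r)   -- e1 unused: prints "0"
  print (either (const "exception") show r4)  -- e1 forced: prints "exception"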


As for optimisations (-O or -O2) – these generally won't change the strictness properties of your program (i.e. they can't switch you between the behaviour of f and f₄). Beyond that, the compiler is pretty much free to make any change it considers beneficial to performance, but usually it won't change anything about what I said above. The main exception, as Taren remarks, is something like f₃: the compiler will readily inline e1 () and then share a reference to the computed value, which prevents the garbage collector from reclaiming the memory. So it's better not to rely on this (anyway somewhat hackish) technique.

leftaroundabout
  • Thanks a lot, you say that "by default" functions are not memoized. If I wanted one to be how would I tell ghc? (also I edited the wrong type-signature) – Chris May 02 '17 at 12:55
  • That's a bit of a different topic, but one simple and reliable way to achieve memoisation of a function is to wrap it in [`memo`](http://hackage.haskell.org/package/MemoTrie-0.6.7/docs/Data-MemoTrie.html#v:memo) from the MemoTrie library. – leftaroundabout May 02 '17 at 12:59
  • If e1 doesn't depend on either a or b I'd expect GHC to float it out so it is memoized, at least with -O2. Edit: even with -O it floats the entire RHS to the top level: `f2 :: Int; f2 = case $w$sg 10000# of ww { __DEFAULT -> I# (-# (+# ww (*# 2# ww)) (*# ww ww)) }`. GHC is very float-happy, to the point where it can be quite hard to stop it from doing this. Just making e1 a function that takes unit will generally NOT be enough to stop it from being memoized like this. – Taren May 02 '17 at 15:03
  • @Taren: GHC sure does this when `e1` is just a small `Int` computation that it can optimise the hell out of – but for those you wouldn't care about memory usage anyway. For big, indirection-heavy, non-inlineable computations, I think it's quite a bit less aggressive. – leftaroundabout May 02 '17 at 15:24
  • @leftaroundabout Try `g i = [1..i]` and `e1 = g 10000000` with -O2. GHC floated it out for me, resulting in ~600 MB peak memory usage. – Taren May 02 '17 at 15:39
  • Interesting, but what was the test case? The interesting question is not whether so much memory is consumed when first using the value, but whether merely holding a reference to something that might later use that value again prevents it from being garbage collected. – leftaroundabout May 02 '17 at 15:47
  • @leftaroundabout As test case I printed the result in main. Without the floating this could interleave garbage collection with printing, resulting in constant memory usage. If the reference is floated to the top level I don't think it would be garbage collected at all. Here is a quick and dirty comparison that might be broken: http://imgur.com/a/nRsuH – Taren May 02 '17 at 15:50
  • Naming it and forcing it with `seq` is the way to guarantee it that's actually in the language spec (i.e. [the Report](https://www.haskell.org/onlinereport/basic.html)). ``f5 a b = if a==b then <simple computation> else e1 `seq` (e1 + 2 * e1 - e1^2) where e1 = <lengthy computation>`` is the solution you want then. All the other tricky and compiler-dependent stuff is unreliable. – Will Ness May 03 '17 at 00:00
4

You can actually check out how GHC will optimize your code:

ghc -ddump-simpl -dsuppress-idinfo -dsuppress-coercions -dsuppress-type-applications -dsuppress-uniques -dsuppress-module-prefixes -fforce-recomp .\scratch.hs

This is a bit of a mouthful, so you might want to alias it. The results very much depend on the optimization level, so you might want to try this out at each level.

With g i = sum [1..i] as the expensive computation and -O2, I get this output:

==================== Tidy Core ====================
Result size of Tidy Core = {terms: 64, types: 23, coercions: 0}

Rec {
-- RHS size: {terms: 16, types: 3, coercions: 0}
$wgo :: Int# -> Int# -> Int#
$wgo =
  \ (w :: Int#) (ww :: Int#) ->
    case w of wild {
      __DEFAULT -> $wgo (+# wild 1#) (+# ww wild);
      10000# -> +# ww 10000#
    }
end Rec }

-- RHS size: {terms: 15, types: 1, coercions: 0}
f2 :: Int
f2 =
  case $wgo 1# 0# of ww { __DEFAULT ->
  I# (-# (+# ww (*# 2# ww)) (*# ww ww))
  }

-- RHS size: {terms: 2, types: 0, coercions: 0}
f1 :: Int
f1 = I# 42#

-- RHS size: {terms: 17, types: 8, coercions: 0}
f :: Int -> Int -> Int
f =
  \ (a :: Int) (b :: Int) ->
    case a of _ { I# x ->
    case b of _ { I# y ->
    case tagToEnum# (==# x y) of _ {
      False -> f2;
      True -> f1
    }
    }
    }

This is pretty ugly compared to your Haskell version, but with a bit of squinting it isn't much more complicated. $wgo is our expensive function. The interesting part is that f1 and f2, the possible return values of f, are each computed only once, when they are first required; for the rest of the program run they are reused.

Taren