5

Lets say I have the following:

data FuncAndValue v res = FuncAndValue (v -> res) v

chain :: (res -> new_res) -> FuncAndValue v res -> FuncAndValue v new_res
chain new_f (FuncAndValue old_f v) = FuncAndValue (new_f . old_f) v  

Is GHC likely to be able to combine functions new_f and old_f into a single function through inlining?

Basically, does storing functions in data types in anyway inhibit optimizations.

I'd like GHC to be easily able to compose chains of functions into one (i.e. so a "sum" on my structure doesn't involve repeated calls to a thunk that represents (+) and instead just inlines the (+) so it runs like a for loop. I'm hoping storing functions in data types and then accessing them later doesn't inhibit this.

Clinton
  • 22,361
  • 15
  • 67
  • 163
  • 2
    Yes, assuming that the implementations of `new_f` and `old_f` are available to the inliner. I.e. in this case it won't because they are parameters, but if you use `chain` in a context where the two functions are known and safe to inline, chain will be inlined and thus the composition inlined and optimized. I'm sure a GHC core guru will be able to prove it (or debunk me) – luqui Jan 06 '13 at 07:05
  • Write your program and worry about optimisation after profiling and finding actual bottlenecks. The sum of your structure thing is more of a strictness issue than inlining, fyi. – Cat Plus Plus Jan 06 '13 at 07:06
  • 2
    @CatPlusPlus: I'm writing a library, not a program, so I think it's impractical to test it with every program it will be run under. So I'd like some general guidelines to go by. – Clinton Jan 06 '13 at 07:28
  • 2
    What you have is the [Store comonad](http://hackage.haskell.org/packages/archive/comonad-transformers/3.0/doc/html/Control-Comonad-Trans-Store.html). (See also [What is the Store comonad?](http://stackoverflow.com/q/8766246/1333025)). And your `chain` function is its `Functor` instance. – Petr Jan 06 '13 at 16:25

1 Answers1

4

Is GHC likely to be able to combine functions new_f and old_f into a single function through inlining?

Yes, if it could do the same without the intervening FuncAndValue. Of course the unfoldings of the functions need to be available, or there wouldn't be any chance of inlining anyway. But if there is a chance, wrapping the function(s) in a FuncAndValue makes little difference if any.

But let's ask GHC itself. First the type and a very simple chaining:

module FuncAndValue where

data FuncAndValue v res = FuncAndValue (v -> res) v

infixr 7 `chain`

chain :: (res -> new_res) -> FuncAndValue v res -> FuncAndValue v new_res
chain new_f (FuncAndValue old_f v) = FuncAndValue (new_f . old_f) v

apply :: FuncAndValue v res -> res
apply (FuncAndValue f x) = f x

trivia :: FuncAndValue Int (Int,Int)
trivia = FuncAndValue (\x -> (2*x - 1, 3*x + 2)) 1

composed :: FuncAndValue Int Int
composed = chain (uncurry (+)) trivia

and (the interesting part of) the core we get for trivia and composed:

FuncAndValue.trivia1 =
  \ (x_af2 :: GHC.Types.Int) ->
    (case x_af2 of _ { GHC.Types.I# y_agp ->
     GHC.Types.I# (GHC.Prim.-# (GHC.Prim.*# 2 y_agp) 1)
     },
     case x_af2 of _ { GHC.Types.I# y_agp ->
     GHC.Types.I# (GHC.Prim.+# (GHC.Prim.*# 3 y_agp) 2)
     })

FuncAndValue.composed2 =
  \ (x_agg :: GHC.Types.Int) ->
    case x_agg of _ { GHC.Types.I# y_agp ->
    GHC.Types.I#
      (GHC.Prim.+#
         (GHC.Prim.-# (GHC.Prim.*# 2 y_agp) 1)
         (GHC.Prim.+# (GHC.Prim.*# 3 y_agp) 2))
    }

Inlined fair enough, no (.) to be seen. The two cases from trivia have been joined so that we have only one in composed. Unless somebody teaches GHC enough algebra to simplify \x -> (2*x-1) + (3*x+2) to \x -> 5*x + 1, that's as good as you can hope. apply composed is reduced to 6 at compile time, even in a separate module.

But that was very simple, let's give it a somewhat harder nut to crack.

An inlinable version of until (the current definition of until is recursive, so GHC doesn't inline it),

module WWUntil where

wwUntil :: (a -> Bool) -> (a -> a) -> a -> a
wwUntil p f = recur
  where
    recur x
        | p x       = x
        | otherwise = recur (f x)

Another simple function it its own module,

collatzStep :: Int -> Int
collatzStep n
    | n .&. 1 == 0  = n `unsafeShiftR` 1
    | otherwise     = 3*n + 1

and finally, the nut

module Hailstone (collatzLength, hailstone) where

import FuncAndValue
import CollatzStep
import WWUntil

data P = P {-# UNPACK #-} !Int {-# UNPACK #-} !Int

fstP :: P -> Int
fstP (P x _) = x

sndP :: P -> Int
sndP (P _ y) = y

hailstone :: Int -> FuncAndValue Int Int
hailstone n = sndP `chain` wwUntil ((== 1) . fstP) (\(P n k) -> P (collatzStep n) (k+1))
                   `chain` FuncAndValue (\x -> P x 0) n

collatzLength :: Int -> Int
collatzLength = apply . hailstone

I have helped the strictness analyser a bit by using a strict pair. With the vanilla (,) the second component would be unboxed and reboxed after adding 1 in each step, and I just can't bear such waste ;) But otherwise there's no relevant difference.

And (the interesting part of) the core GHC generates:

Rec {
Hailstone.$wrecur [Occ=LoopBreaker]
  :: GHC.Prim.Int#
     -> GHC.Prim.Int# -> (# GHC.Prim.Int#, GHC.Prim.Int# #)
[GblId, Arity=2, Caf=NoCafRefs, Str=DmdType LL]
Hailstone.$wrecur =
  \ (ww_sqq :: GHC.Prim.Int#) (ww1_sqr :: GHC.Prim.Int#) ->
    case ww_sqq of wild_Xm {
      __DEFAULT ->
        case GHC.Prim.word2Int#
               (GHC.Prim.and# (GHC.Prim.int2Word# wild_Xm) (__word 1))
        of _ {
          __DEFAULT ->
            Hailstone.$wrecur
              (GHC.Prim.+# (GHC.Prim.*# 3 wild_Xm) 1) (GHC.Prim.+# ww1_sqr 1);
          0 ->
            Hailstone.$wrecur
              (GHC.Prim.uncheckedIShiftRA# wild_Xm 1) (GHC.Prim.+# ww1_sqr 1)
        };
      1 -> (# 1, ww1_sqr #)
    }
end Rec }

lvl_rsz :: GHC.Types.Int -> GHC.Types.Int
[GblId, Arity=1, Caf=NoCafRefs]
lvl_rsz =
  \ (x_iog :: GHC.Types.Int) ->
    case x_iog of _ { GHC.Types.I# tpl1_B4 ->
    case Hailstone.$wrecur tpl1_B4 0 of _ { (# _, ww2_sqH #) ->
    GHC.Types.I# ww2_sqH
    }
    }

and that's exactly what you get without FuncAndValue. Everything inlined nicely, a beautiful tight loop.

Basically, does storing functions in data types in anyway inhibit optimizations.

If you wrap the function under enough layers, yes. But it's the same with other values.

Daniel Fischer
  • 181,706
  • 17
  • 308
  • 431