8

I'm concerned with if and when a polymorphic "global" class value is shared/memoized, particularly across module boundaries. I have read this and this, but they don't quite seem to reflect my situation, and I'm seeing some different behavior from what one might expect from the answers.

Consider a class that exposes a value that can be expensive to compute:

{-# LANGUAGE FlexibleInstances, UndecidableInstances #-}

module A

import Debug.Trace

class Costly a where
  costly :: a

instance Num i => Costly i where
  -- an expensive (but non-recursive) computation
  costly = trace "costly!" $ (repeat 1) !! 10000000

foo :: Int
foo = costly + 1

costlyInt :: Int
costlyInt = costly

And a separate module:

module B
import A

bar :: Int
bar = costly + 2

main = do
  print foo
  print bar
  print costlyInt
  print costlyInt

Running main yields two separate evaluations of costly (as indicated by the trace): one for foo, and one for bar. I know that costlyInt just returns the (evaluated) costly from foo, because if I remove print foo from main then the first costlyInt becomes costly. (I can also cause costlyInt to perform a separate evaluation no matter what, by generalizing the type of foo to Num a => a.)

I think I know why this behavior happens: the instance of Costly is effectively a function that takes a Num dictionary and generates a Costly dictionary. So when compiling bar and resolving the reference to costly, ghc generates a fresh Costly dictionary, which has an expensive thunk in it. Question 1: am I correct about this?

There are a few ways to cause just one evaluation of costly, including:

  • Put everything in one module.
  • Remove the Num i instance constraint and just define a Costly Int instance.

Unfortunately, the analogs of these solutions are not feasible in my program -- I have several modules that use the class value in its polymorphic form, and only in the top-level source file are concrete types finally used.

There are also changes that don't reduce the number of evaluations, such as:

  • Using INLINE, INLINABLE, or NOINLINE on the costly definition in the instance. (I didn't expect this to work, but hey, worth a shot.)
  • Using a SPECIALIZE instance Costly Int pragma in the instance definition.

The latter is surprising to me -- I'd expected it to be essentially equivalent to the second item above that did work. That is, I thought it would generate a special Costly Int dictionary, which all of foo, bar, and costlyInt would share. My question 2: what am I missing here?

My final question: is there any relatively simple and foolproof way to get what I want, i.e., all references to costly of a particular concrete type being shared across modules? From what I've seen so far, I suspect the answer is no, but I'm still holding out hope.

Community
  • 1
  • 1
Chris Peikert
  • 563
  • 4
  • 8
  • How exactly is this question different from [my earlier one](http://stackoverflow.com/questions/25057803/is-there-an-automatic-way-to-memoise-global-polymorphic-values-in-haskell)? – leftaroundabout May 18 '15 at 21:17
  • From what I can tell, the costliness in your question comes from recursion combined with dictionary passing, which causes exponential reevaluation. Once the latter is fixed, the computation itself is fast. But if two modules referred to fibsImplicitDict, they'd get different thunks that would be evaluated separately. I can't afford that, because my computation is inherently expensive (not due to recursion). – Chris Peikert May 18 '15 at 21:40
  • PS -- the fact that there are class instances with constraints on them is also playing a role here, it seems. – Chris Peikert May 18 '15 at 21:47
  • Do you need to use a type class here? If it only has the one instance, you shouldn't need it. – David Young May 18 '15 at 22:34
  • In my real program, yes -- I have many instances. Costly represents a ring that has roots of unity, and I have instances for complex numbers, integers mod q, finite fields, etc. Their roots of unity aren't cheap enough to recompute all over the place. – Chris Peikert May 18 '15 at 22:39

1 Answers1

6

Controlling sharing is tricky in GHC. There are many optimizations that GHC does which can affect sharing (such as inlining, floating things out, etc).

In this case, to answer the question why the SPECIALIZE pragma did not achieve the intended effect, let's look at the Core of the B module, in particular of the bar function:

Rec {
bar_xs
bar_xs = : x1_r3lO bar_xs
end Rec }

bar1 = $w!! bar_xs 10000000
--     ^^^ this repeats the computation. bar_xs is just repeat 1

bar =
  case trace $fCostlyi2 bar1 of _ { I# x_aDm -> I# (+# x_aDm 2) }
  --         ^^^ this is just the "costly!" string

That didn't work as we wanted. Instead of reusing costly, GHC decided to just inline the costly function.

So we have to prevent GHC from inlining costly, or the computation will be duplicated. How do we do that? You might think adding a {-# NOINLINE costly #-} pragma would be enough, but unfortunately specialization without inlining don't seem to work together well:

A.hs:13:3: Warning:
    Ignoring useless SPECIALISE pragma for NOINLINE function: ‘$ccostly’

But there is a trick to convince GHC to do what we want: we can write costly in the following way:

instance Num i => Costly i where
  -- an expensive (but non-recursive) computation
  costly = memo where
    memo :: i
    memo = trace "costly!" $ (repeat 1) !! 10000000
    {-# NOINLINE memo #-}
  {-# SPECIALIZE instance Costly Int #-}
-- (this might require -XScopedTypeVariables)

This allows us to specialize costly, will simultanously avoiding the inlining of our computation.

bennofs
  • 11,873
  • 1
  • 38
  • 62
  • Outstanding! I had tried the memo trick, and NOINLINE with SPECIALIZE (only to get the same error), but hadn't thought to combine them in this way. This trick will work for some of my global values, but for others I can't SPECIALIZE because I only know the concrete types at the topmost application module. Is there any hope for that case? – Chris Peikert May 19 '15 at 18:33
  • @ChrisPeikert I haven't checked, but maybe inlining `costly` could achieve the same effect as specialization in this case? – bennofs May 19 '15 at 18:36
  • Right, inlining is the opposite of what I want (i.e., sharing). – Chris Peikert May 19 '15 at 18:37
  • @ChrisPeikert I don't think this is possible. If GHC doesn't know in advance which types the function should be memoized for, there is no way to make the appropriate "top-level memos" that store the memoized value. The only way this could work is if you inlined all functions using `costly` and then somehow arranged for all the memos to be floated out and eliminated by CSE. I doubt that this would work reliably though ... – bennofs May 19 '15 at 18:45
  • Or alternatively, get GHC to auto-specialize costly and CSE all the specializations of `memo` for the same type. – bennofs May 19 '15 at 18:46