20

lens offers holesOf, which is a somewhat more general and powerful version of this hypothetical function:

holesList :: Traversable t
          => t a -> [(a, a -> t a)]

Given a container, holesList produces a list of elements of the container along with functions for replacing those elements.

The type of holesList, like that of the real holesOf, fails to capture the fact that the number of pairs produced will equal the number of elements of the container. A much more beautiful type, therefore, would be

holes :: Traversable t
      => t a -> t (a, a -> t a)

We could implement holes by using holesList to make a list and then traversing in State to slurp the elements back in. But this is unsatisfactory for two reasons, one of which has practical consequences:

  1. The slurping code will have an unreachable error call to handle the case where the list runs empty before the traversal is complete. This is disgusting, but probably doesn't matter much to someone using the function.

  2. Containers that extend infinitely to the left, or that bottom out on the left, won't work at all. Containers that extend very far to the left will be very inefficient to handle.

I'm wondering if there's any way around these problems. It's quite possible to capture the shape of the traversal using something like Magma in lens:

data FT a r where
  Pure :: r -> FT a r
  Single :: a -> FT a a
  Map :: (r -> s) -> FT a r -> FT a s
  Ap :: FT a (r -> s) -> FT a r -> FT a s

instance Functor (FT a) where
  fmap = Map
instance Applicative (FT a) where
  pure = Pure
  (<*>) = Ap

runFT :: FT a t -> t
runFT (Pure t) = t
runFT (Single a) = a
runFT (Map f x) = f (runFT x)
runFT (Ap fs xs) = runFT fs (runFT xs)

Now we have

runFT . traverse Single = id

traverse Single makes a tree full of elements along with the function applications needed to build them into a container. If we replace an element in the tree, we can runFT the result to get a container with that element replaced. Unfortunately, I am stuck: I don't know what the next step might look like.


Vague thoughts: adding another type parameter might help change element types. The Magma type does something like this, and it goes back at least as far as Zemyla's comment on Van Laarhoven's blog post about FunList.

dfeuer
  • 48,079
  • 5
  • 63
  • 167
  • 2
    Tangent. This reminds me of [`wigglesum :: Traversable t => (a -> [a]) -> (t a -> [t a])`](https://jaspervdj.be/posts/2012-10-17-wiggling-sums.html) that can be implemented using [`holesOf`](http://r6.ca/blog/20121209T182914Z.html): `wigglesum wiggle = holesOf traverse >=> experiment wiggle` – Iceland_jack Feb 23 '18 at 19:25
  • 1
    @Iceland_jack, [Russell O'Connor's response](http://r6.ca/blog/20121209T182914Z.html) to that blog post is what led me into this territory. Easily nerd-sniped, I submitted a now-merged [PR to lens](https://github.com/ekmett/lens/pull/789) removing the slurping from `holesof`. – dfeuer Feb 23 '18 at 20:07
  • Interesting commit! – Iceland_jack Feb 23 '18 at 23:07

4 Answers4

13

Your existing solution calls runMag once for every branch in the tree defined by Ap constructors.

I haven't profiled anything, but as runMag is itself recursive, this might slow things down in a large tree.

An alternative would be to tie the knot so you're only (in effect) calling runMag once for the entire tree:

data Mag a b c where
  One :: a -> Mag a b b
  Pure :: c -> Mag a b c
  Ap :: Mag a b (c -> d) -> Mag a b c -> Mag a b d

instance Functor (Mag a b) where
  fmap = Ap . Pure

instance Applicative (Mag a b) where
  pure = Pure
  (<*>) = Ap

holes :: forall t a. Traversable t => t a -> t (a, a -> t a)
holes = \t -> 
    let m :: Mag a b (t b)
        m = traverse One t 
    in fst $ go id m m
  where
    go :: (x -> y)
       -> Mag a (a, a -> y) z
       -> Mag a a x
       -> (z, x)
    go f (One a)    (One _)    = ((a, f), a)
    go _ (Pure z)   (Pure x)   = (z, x)
    go f (Ap mg mi) (Ap mh mj) = 
      let ~(g, h) = go (f . ($j)) mg mh
          ~(i, j) = go (f .   h ) mi mj
      in (g i, h j)
    go _ _ _ = error "only called with same value twice, constructors must match"
rampion
  • 87,131
  • 49
  • 199
  • 315
  • Very clever. I had the feeling there might be some fancy knot to be tied, but the types were already hard enough for me to navigate without trying to find that! – dfeuer Feb 27 '18 at 06:15
  • 1
    You can pretty much forget about profiling. It appears that your version gets a *ton* of sharing among the results, whereas mine does not. I don't know just how they compare in lazier circumstances, but yours can make an enormous `Map` of `Map`s without any difficulty, while mine cannot. – dfeuer Feb 27 '18 at 07:24
8

I have not managed to find a really beautiful way to do this. That might be because I'm not clever enough, but I suspect it is an inherent limitation of the type of traverse. But I have found a way that's only a little bit ugly! The key indeed seems to be the extra type argument that Magma uses, which gives us the freedom to build a framework expecting a certain element type and then fill in the elements later.

data Mag a b t where
  Pure :: t -> Mag a b t
  Map :: (x -> t) -> Mag a b x -> Mag a b t
  Ap :: Mag a b (t -> u) -> Mag a b t -> Mag a b u
  One :: a -> Mag a b b

instance Functor (Mag a b) where
  fmap = Map

instance Applicative (Mag a b) where
  pure = Pure
  (<*>) = Ap

-- We only ever call this with id, so the extra generality
-- may be silly.
runMag :: forall a b t. (a -> b) -> Mag a b t -> t
runMag f = go
  where
    go :: forall u. Mag a b u -> u
    go (Pure t) = t
    go (One a) = f a
    go (Map f x) = f (go x)
    go (Ap fs xs) = go fs (go xs)

We recursively descend a value of type Mag x (a, a -> t a) (t (a, a -> t a)) in parallel with one of type Mag a a (t a) using the latter to produce the a and a -> t a values and the former as a framework for building t (a, a -> t) from those values. x will actually be a; it's left polymorphic to make the "type tetris" a little less confusing.

-- Precondition: the arguments should actually be the same;
-- only their types will differ. This justifies the impossibility
-- of non-matching constructors.
smash :: forall a x t u.
         Mag x (a, a -> t) u
      -> Mag a a t
      -> u
smash = go id
  where
    go :: forall r b.
          (r -> t)
       -> Mag x (a, a -> t) b
       -> Mag a a r
       -> b
    go f (Pure x) _ = x
    go f (One x) (One y) = (y, f)
    go f (Map g x) (Map h y) = g (go (f . h) x y)
    go f (Ap fs xs) (Ap gs ys) =
      (go (f . ($ runMag id ys)) fs gs)
      (go (f . runMag id gs) xs ys)
    go _ _ _ = error "Impossible!"

We actually produce both Mag values (of different types!) using a single call to traverse. These two values will actually be represented by a single structure in memory.

holes :: forall t a. Traversable t => t a -> t (a, a -> t a)
holes t = smash mag mag
  where
    mag :: Mag a b (t b)
    mag = traverse One t

Now we can play with fun values like

holes (Reverse [1..])

where Reverse is from Data.Functor.Reverse.

dfeuer
  • 48,079
  • 5
  • 63
  • 167
  • `go (One a) = f a` seems to me to unify `b` with `u`. – Gurkenglas Feb 27 '18 at 00:41
  • @Gurkenglas, the pattern match on `One` does that. But in other cases they won't be the same. Consider `Map Just (One x)`. – dfeuer Feb 27 '18 at 00:43
  • 1
    One suggestion to make it impossible to call `smash` incorrectly - change it to `smash :: (forall b. Mag a b (t b)) -> t (a, a -> t a); smash = \m -> go id m m` – rampion Feb 27 '18 at 01:47
  • @rampion, I wasn't sure what would be clearest. That's certainly a good approach. – dfeuer Feb 27 '18 at 01:51
7

Here is an implementation that is short, total (if you ignore the circularity), doesn't use any intermediate data structures, and is lazy (works on any kind of infinite traversable):

import Control.Applicative
import Data.Traversable

holes :: Traversable t => t a -> t (a, a -> t a)
holes t = flip runKA id $ for t $ \a ->
  KA $ \k ->
    let f a' = fst <$> k (a', f)
    in (a, f)

newtype KA r a = KA { runKA :: (a -> r) -> a }

instance Functor (KA r) where fmap f a = pure f <*> a
instance Applicative (KA r) where
  pure a = KA (\_ -> a)
  liftA2 f (KA ka) (KA kb) = KA $ \cr ->
    let
      a = ka ar
      b = kb br
      ar a' = cr $ f a' b
      br b' = cr $ f a b'
    in f a b

KA is a "lazy continuation applicative functor". If we replace it with the standard Cont monad, we also get a working solution, which is not lazy, however:

import Control.Monad.Cont
import Data.Traversable

holes :: Traversable t => t a -> t (a, a -> t a)
holes t = flip runCont id $ for t $ \a ->
  cont $ \k ->
    let f a' = fst <$> k (a', f)
    in k (a, f)
Roman Cheplyaka
  • 37,738
  • 7
  • 72
  • 121
  • Does that `fst <$>` risk a space leak? This code is so mind-bendy I can't tell. If I pluck an `a -> t a` out of the result, apply it to a value, and consume the result "from the top down", will the garbage collector be able to collect the top of the structure, or will it hang on to it through never-to-be-realized `a -> t a` values? Regardless, this is a beautiful construction. – dfeuer Feb 27 '18 at 17:17
  • Hrmm... Also, it seems this unfortunately doesn't get the magical sharing of @rampion's solution. I fear that may be the price it pays for avoiding the ugly double pattern matching. So I think your way is the most beautiful, but probably not one I'd choose in practice. – dfeuer Feb 27 '18 at 17:32
  • Do you think there's a way to fix the performance problem, perhaps at the expense of just *some* of what makes this solution more theoretically nice than rampion's? This weird continuation thing breaks my brain; is there somewhere I could read about it? – dfeuer Feb 27 '18 at 18:48
  • In my experiments, my solution runs on par or even somewhat better (faster and less memory) than rampion's. Try `print $ sum $ snd (holes [(0::Int)..limit] !! 1) 42`. – Roman Cheplyaka Feb 27 '18 at 19:10
  • 2
    As to whether you can read about it somewhere, I don't know; I just invented it, but I wouldn't be surprised if someone considered it before. I am writing a blog post about it which you'll be able to read soon though. – Roman Cheplyaka Feb 27 '18 at 19:33
  • 2
    Given a big `Map Int Int` (10000 elements), take the holes and `Strict.map (($ 100) . snd)` it, evaluating to WHNF. Your solution seems to blow up badly, while rampion's completes almost immediately with only modest allocation. – dfeuer Feb 27 '18 at 19:36
  • I suspect your solution works well for streamy things, but seems to be O(n^2) rather than O(n log n) for balanced trees with typical pre/post/inorder traversals. – dfeuer Feb 27 '18 at 19:45
  • Ok, now I see. The difference is not so much between list/map but between evaluating one hole or many. – Roman Cheplyaka Feb 27 '18 at 20:06
  • dfeuer: You [can get rid of the intermediate `fmap`s by using a recursive newtype](https://gist.github.com/rampion/bb4ad75a85c431f1c3e70ae29ce369ce). I still need to do some analysis to see what this does in terms of sharing work, so I'm not sure it's any performance improvement. – rampion Feb 28 '18 at 15:27
  • @rampion, I don't *think* that gives us the sharing we want, but you should check with the `Map` test I describe above. Be sure to use the strict mapping function rather than `fmap`. – dfeuer Feb 28 '18 at 23:21
  • Roman, I came up with an implementation of `holesList` that may or may not inspire you to find an improvement of your technique: https://gist.github.com/treeowl/789d43a641eff65083f724fc56d28234 This gets sharing without a sketchy `Applicative` instance, but it's a sad listy version. – dfeuer Mar 01 '18 at 05:12
1

This doesn't really answer the original question, but it shows another angle. It looks like this question is actually tied rather deeply to a previous question I asked. Suppose that Traversable had an additional method:

traverse2 :: Biapplicative f
           => (a -> f b c) -> t a -> f (t b) (t c)

Note: This method can actually be implemented legitimately for any concrete Traversable datatype. For oddities like

newtype T a = T (forall f b. Applicative f => (a -> f b) -> f (T b))

see the illegitimate ways in the answers to the linked question.

With that in place, we can design a type very similar to Roman's, but with a twist from rampion's:

newtype Holes t m x = Holes { runHoles :: (x -> t) -> (m, x) }

instance Bifunctor (Holes t) where
  bimap f g xs = Holes $ \xt ->
    let
      (qf, qv) = runHoles xs (xt . g)
    in (f qf, g qv)

instance Biapplicative (Holes t) where
  bipure x y = Holes $ \_ -> (x, y)
  fs <<*>> xs = Holes $ \xt ->
    let
      (pf, pv) = runHoles fs (\cd -> xt (cd qv))
      (qf, qv) = runHoles xs (\c -> xt (pv c))
    in (pf qf, pv qv)

Now everything is dead simple:

holedOne :: a -> Holes (t a) (a, a -> t a) a
holedOne x = Holes $ \xt -> ((x, xt), x)

holed :: Traversable t => t a -> t (a, a -> t a)
holed xs = fst (runHoles (traverse2 holedOne xs) id)
dfeuer
  • 48,079
  • 5
  • 63
  • 167