20

The Applicative instance for Data.Sequence generally performs very well. Almost all the methods are incrementally asymptotically optimal in time and space. That is, given fully forced/realized inputs, it's possible to access any portion of the result in asymptotically optimal time and memory residency. There is one remaining exception: (<*). I only know two ways to implement it as yet:

  1. The default implementation

     xs <* ys = liftA2 const xs ys
    

    This implementation takes O(|xs| * |ys|) time and space to fully realize the result, but only O(log(min(k, |xs|*|ys|-k))) to access just the kth element of the result.

  2. A "monadic" implementation

     xs <* ys = xs >>= replicate (length ys)
    

    This takes only O(|xs| * log |ys|) time and space, but it's not incremental; accessing an arbitrary element of the result requires O(|xs| * log |ys|) time and space.

I have long believed that it should be possible to have our cake and eat it too, but I've never been able to juggle the pieces in my mind well enough to get there. To do so appears to require a combination of ideas (but not actual code) from the implementations of liftA2 and replicate. How can this be done?


Note: it surely won't be necessary to incorporate anything like the rigidify mechanism of liftA2. The replicate-like pieces should surely produce only the sorts of "rigid" structures we use rigidify to get from user-supplied trees.

Update (April 6, 2020)

Mission accomplished! I managed to find a way to do it. Unfortunately, it's a little too complicated for me to understand everything going on, and the code is ... rather opaque. I will upvote and accept a good explanation of what I've written, and will also happily accept suggestions for clarity improvements and comments on GitHub.

Update 2

Many thanks to Li-Yao Xia and Bertram Felgenhauer for helping to clean up and document my draft code. It's now considerably less difficult to understand, and will appear in the next version of containers. It would still be nice to get an answer to close out this question.

dfeuer
  • 48,079
  • 5
  • 63
  • 167
  • Imo it would help everyone to explain a bit how you get those bounds, perhaps even a diagram of what the different implementations produce? Otherwise everyone who would like to think about the problem has to independently do the work to get to where you're starting from. – moonGoose Mar 10 '20 at 19:08
  • Is `join $ replicate (length ys) <$> xs` the same as 2? – moonGoose Mar 10 '20 at 19:48
  • @moonGoose, yes, it is. – dfeuer Mar 10 '20 at 21:26
  • 1
    @moonGoose, I hear you about trying to give more details, but I don't think it will help. This is most definitely not going to be a quick-and-easy question for *anybody*. Answering it will require a deep dive into `Data.Sequence` internals; explaining the performance analysis doesn't seem likely to reduce the amount of work or effort involved. – dfeuer Mar 10 '20 at 21:29
  • @moonGoose, by way of background, I'm the one who wrote the `liftA2` implementation, and there are still parts of it that I don't understand myself. `replicate` was one of several extremely clever functions Louis Wasserman wrote; it's simpler than his `tails` but still pretty tricky. – dfeuer Mar 10 '20 at 21:37
  • Aha! I think I've found a way to manage the complexity at a (very small) efficiency cost! The trick will be a stripped-down version of the type of rigid trees that, instead of having digits, just has numbers indicating how big those digits are. This is all the structural information we need, because replication can (I'm pretty sure) still work in redundant base 3! So we just need a function (or class method) to flesh out those rigid tree bits, and a slightly modified `aptyMiddle` to put it all together. – dfeuer Mar 11 '20 at 22:03
  • 1
    @dfeuer I think I've reduced the problem to constructing a function whose type would look like this with dependent types: `repeat :: Int -> a -> (exists (n :: Nat). Node^n a)` (this existential is equivalent to what you get from throwing away the `Digits` fields in the definition of `FingerTree`), which I find awfully non-obvious to implement. – Li-yao Xia Mar 17 '20 at 14:32
  • @Li-yaoXia, something very similar goes on in the `aptyMiddle` function. The tricky bit here is being sure that the function constructed (when applying the size to your hypothetical version) creates a result with the required internal sharing. – dfeuer Mar 17 '20 at 14:56

0 Answers0