Undoing a prefix sum

Question

The simple way to calculate a prefix sum in Haskell is

scanl1 (+) [1,2,3,4,5,6]

which produces the output

[1,3,6,10,15,21]

I needed to write the inverse, and this is what I came up with:

undo_prefix_sum :: (Num a) => [a] -> [a]
undo_prefix_sum s = init $ snd $ 
    foldr (\cur (tot, l) -> (cur, (tot-cur):l)) 
          (last s, []) 
          (0:s)

This seems to be correct (but I might have missed something). Is there a simpler or more efficient way to do this, possibly using a scan?

You mention `zipWith (-) xs (0:xs)` in another comment and "illuminating interesting parts of Haskell". If you haven't seen `zipWith (-) <*> (0 :) $ [1..6]` you might be up for some illuminating. — Micha Wiedenmann, Sep 16 '21 at 07:46
Thanks. My experience with this was much like the one I had with ["juxt" in haskell](https://coderwall.com/p/znfnjw/the-haskell-equivalent-of-clojure-s-juxt-is-sequence). In this case, the definition of `<*>` for functions uses its argument twice; in the other one, the Reader monad instance for functions is implicated, which has the similar effect of using the argument more than once! — Theo H, Sep 16 '21 at 09:52
`scanl1 (+)` does not calculate a prefix sum. it calculates prefix sums. — Will Ness, Sep 23 '21 at 10:23
@MichaWiedenmann after `scanl1 (+) [1..20]` in GHCI try `zipWith (-) <*> (0 :) $ it`. (wait for it...) and then try `zipWith (-) <*> (0 :) $ it` *again*. and again. and again. :) and again. :) I don't know about illuminating, but it's interesting. — Will Ness, Sep 24 '21 at 09:39
a sum is a number. a scan produces a list (here, of numbers). a scan is an unfolding, in unfolds step by step a list of partial results, here partial sums of a list. a folding produces one final result. — Will Ness, Sep 25 '21 at 07:33

chepner · Accepted Answer · 2021-09-16T01:27:41.347

5

In my opinion, as prefix_sum is naturally expressed as a fold, undo_prefix_sum is more naturally expressed as an unfold.

import Data.List

undo_prefix_sum  = unfoldr (\xs -> if null xs 
                                   then Nothing
                                   else let (h:t) = xs 
                                        in Just (h,  map (subtract h) t))

unfoldr uses a function to build a list starting from a seed value. In this case, the seed is a list itself; we use the first element of the seed in our result, and (a modified version of) the rest of the list as the seed to compute the rest of the result recursively.

edited Sep 16 '21 at 01:27

answered Sep 16 '21 at 01:09

chepner

497,756
71
530
681

2

That's exactly the kind of answer I was looking for. `undo_prefix_sum xs = zipWith (-) xs (0:xs)` works as well, but it doesn't illuminate any more interesting parts of Haskell in the way your answer does. – Theo H Sep 16 '21 at 04:07
the repeated `(subtract h)`s stack up, making this a quadratic code. the `zipWith (-) <*> (0:)` calculation is linear. so it's more efficient, and also simpler. – Will Ness Sep 23 '21 at 10:14
(the above was meant @TheoH) – Will Ness Sep 24 '21 at 09:31
@WillNess Is that an insurmountable obstacle? Wondering if there's a way to tweak this so that the stacks of subtract sections can be "fused" into single functions. I do like the symmetry of computing this as an unfold, but not at the cost of the (overlooked) quadratic behavior. – chepner Sep 24 '21 at 11:48
when we fuse them we end up with ... the `zipWith ... tail` version. (I've posted an answer with a few re-writes and also comment on this) – Will Ness Sep 24 '21 at 12:26
re unfold, what you mean is anamorphism. and actually foldr with a lazy combining function is just as capable of coding it. (or something that is close enough). but we'd need to first zip the list with its tail. or we could emulate *paramorphism*, by folding over the `tails` of the list. (that's what I meant about "commenting on" in my answer). and the prefix_sums (NB plural) calculation is not just catamorphism, since it's not strict. I'm not sure but I think catamorphisms are necessarily strict. – Will Ness Sep 24 '21 at 12:35
@WillNess It's clear that the `zipWith` approach avoids needing an accumulator to carry the "running total" to the next element. A more complex unfold could pass this total along, to be subtracted from the head of the input at each step. AFAICS this would avoid the quadratic behaviour caused by waves of subtractions, but the code would no longer be more elegant than `zipWith`. – Theo H Sep 24 '21 at 20:38
@WillNess As far as your insistence on plural "prefix sums" goes, it's obviously not a matter of grammatical logic but of accepted usage. Some people call the function `all-prefix-sums` but I had a good look, and since the 80s, the general problem has been known as "the prefix problem" and the domain as "prefix computation". There's also no ambiguity—what else could a function called "prefix sum" do? So the usage has moved on from a basic logical English expression to the notion of "prefix computation" as a term of art (and that idea doesn't refer to refer to `take n`! – Theo H Sep 24 '21 at 20:44
@TheoH I thought, "calculate a sum of a prefix", just from the _grammatics_ of it. :) as for the elegance, it's in the eyes of the beholder. for me the zipWith code is the most elegant. and there is no need to calculate any running total, its already there. it's the previous number. in fact, my insistence on the plural was to hint that the prefix-sums calculation is an _unfolding_, its step being the addition of the next element. thus its reverse is the removal of the previous one, which was the previous prefix's sum (the "total" you wanted to compute anew). – Will Ness Sep 25 '21 at 06:13
@TheoH so maybe if the accepted term of art is misleading, it's better ignored/corrected, not followed to the letter. :) – Will Ness Sep 25 '21 at 06:26

score 0 · Answer 2 · answered Sep 24 '21 at 12:26

Your function should have been written as

undo_prefix_sum' :: (Num a) => [a] -> [a]
undo_prefix_sum' xs = init $ snd $ 
    foldr (\x ~(y, ys) -> (x, (y-x):ys)) 
          (undefined, []) 
          (0:xs)

The tot in your code is not "total", it's simply the next value in the input after x. So some symmetry in the names is called for here (naming is important). The last xs bit is never calculated because of the init call so can just be undefined, to avoid giving the false impression.

Since all it does is to calculate the difference between each element and its predecessor (or 0 for the very first one), we might as well be clear about that:

foo xs = [y-x | i <- [0..length xs-1], let y=xs!!i ; x = (0:xs)!!i ]

which is of course quadratic, and it calculates the length which is a big no-no; but at least it's simple, so it's simple enough to fix:

bar xs = [y-x | x <- 0:xs | y <- xs]
       = map (uncurry (-)) $ zip xs (0:xs)
       = zipWith (-) xs (0:xs)

which you've already seen, as stated in the comments. So it really is equivalent to your code, computationally, unlike the code in the other answer which calculates the same result by means of a substantially different, essentially quadratic (and needlessly so), calculation.

If you really wanted to express the zipWith (-) process as an unfolding, it could be done as

baz xs = [y-x | (x:y:_) <- tails (0:xs)]

since tails ~= iterate tail, and iterating and unfolding are essentially the same thing (each can be expressed as the other, with some possibly needed pre- and/or post- processing steps).

One last thing. There indeed is a problem with your code. It is needlessly strict:

> take 2 $ undo_prefix_sum $ [1,2] ++ undefined
*** Exception: Prelude.undefined

> take 2 $ undo_prefix_sum' $ [1,2] ++ undefined
[1,1]

foo above is also similarly strict. All the other versions in this answer (and code in the other answer as well) successfully and correctly calculate the result. The only difference between the undo_prefix_sum' here and your code, except for the variables renaming, is a very small one.

Can you spot it?

With that out of the way, whether your code is nominally using foldr or unfoldr, the important thing is whether it is lazy enough. foldr with a lazy combining function is just as capable of coding that. Catamorphisms like sum are necessarily strict. To write your function as a proper anamorphism so it is still linear as it should be, we need to pre-process the input by zipping it with its tail since we need the two head elements at once. Another way to code this is with a paramorphism so that we'd have access to the input's current tail as well as its current element. And paramorphisms are emulated easily enough by folding over the tails (yes, like baz is doing).

Undoing a prefix sum

2 Answers2