10

A Clojure programmer I know recently remarked that it's possible to implement lots of sequence functions in terms of Clojure's reduce (which is Haskell's foldl'), but that regrettably there's no way to implement (map list xs ys) (which is Haskell's zip) with just reduce.

Now, I've read about the universality of folds, so I'm pretty sure this isn't true: certainly zip is possible with foldr, and I'd be surprised if it weren't possible with foldl. For purposes of this discussion we're ignoring the problems of the eagerness of foldl compared to foldr: imagine we have unlimited memory and only work with finite sequences.

So, I took the Haskell code to implement zip with foldr, and translated it to Clojure, doing my best to adjust for the difference between foldr and foldl:

(defn zip [xs ys]
  (letfn [(done [_] [])
          (step [z x]
            (fn [ys]
              (if (empty? ys)
                []
                (conj (z (rest ys)) [x (first ys)]))))]
    ((reduce step done xs) ys)))
user> (zip [1 2] '[a b c]) ;=> [[1 b] [2 a]]

And indeed we do get pairs of elements from xs and ys tupled together, but out of order: the first element of xs is paired with the last element of ys, and so on down the line. I can kinda see the cause of the problem: the function we produce consumes ys starting from the left, but the outermost closure (which is called first) has closed over the last element of xs, so it can't pair them in the right order.

I think the fix is to somehow build the closures inside-out, but I can't really see how to do it. I would happily accept a solution in either Haskell or Clojure.

I'm hoping for a solution of the form zip = foldl f x, so that I can say it's "just" a reduce. Of course I can reverse one of the lists, but then it ends up looking like zip xs ys = foldl f x xs $ reverse ys which doesn't seem very satisfying or clean.

Community
  • 1
  • 1
amalloy
  • 89,153
  • 8
  • 140
  • 205
  • If you don't care at all about efficiency, you can use `init` instead of `tail` and/or `++[a]` instead of `a:`. – dfeuer Jan 24 '15 at 22:54
  • Alternatively, you can find an implementation of `foldr` in terms of `foldl` (for finite lists) on the Internet. – dfeuer Jan 24 '15 at 22:55
  • Well, I'm using `conj` in the Clojure version, which is an efficient version of `++ [a]` (using a different data structure than linked lists). That doesn't actually fix it: you can get either xs or ys in the right order, but not both in any obvious way that I can see. – amalloy Jan 24 '15 at 22:56
  • http://blog.wakatta.jp/blog/2011/11/11/haskell-foldr-as-foldl/ gives you all you need, I'm sure. – dfeuer Jan 24 '15 at 23:02
  • +1 for taking my question seriously the other day! glad to see this discussion continuing. i will be digesting all these answers and others i've found before i comment further – johnbakers Jan 26 '15 at 22:04

4 Answers4

4

In Haskell:

-- foldr f z xs == foldl (\r x a-> r (f x a)) id xs z

zip1 xs ys = -- foldr f (const []) xs ys
            foldl (\r x a-> r (f x a)) id xs (const []) ys
  where
    f x r []     = []
    f x r (y:ys) = (x,y) : r ys

Prelude> zip1 [1..9] [100..120]
[(1,100),(2,101),(3,102),(4,103),(5,104),(6,105),(7,106),(8,107),(9,108)]

Since Clojure likes appending at the end of list, another variant is

zip2 xs ys = foldl f (const []) xs ys
  where
    f r x [] = []
    f r x ys = let (ys0,y) = (init ys, last ys) -- use some efficient Clojure here
               in r ys0 ++ [(x,y)]

Prelude> zip2 [1..9] [100..120]
[(1,112),(2,113),(3,114),(4,115),(5,116),(6,117),(7,118),(8,119),(9,120)]

As you can see the end of lists lines up here, instead of the front.

Community
  • 1
  • 1
Will Ness
  • 70,110
  • 9
  • 98
  • 181
  • 1
    Thanks! I have to say, it looks more readable in Haskell: https://www.refheap.com/37b61066156c2265969b7dd89 – amalloy Jan 26 '15 at 21:51
  • @amalloy, note that in this case, it's *not* just folding over the list. It's folding over the list to produce a function, and then applying the function. – dfeuer Jan 27 '15 at 00:28
3

Another Clojure expert pointed out that it's easier to do this in Clojure without trying to use the same structure as Haskell's foldr solution:

(defn zip [xs ys]
  (first (reduce
          (fn [[acc rest :as state] itm]
            (if rest
              [(conj acc [itm (first rest)])
               (next rest)]
              state))
          [[] (seq ys)]
          xs)))

This just folds over a pair, the first of which is a result sequence being built up, and the second is what remains of ys; xs are being fed in one item at a time via the reduce. In Haskell this would look like:

zip' :: [a] -> [b] -> [(a,b)]
zip' xs ys = fst $ foldl f ([], ys) xs
  where f state@(_, []) _x = state
        f (acc, (y:ys)) x = ((acc ++ [(x, y)]), ys)

except that of course the acc ++ [(x, y)] is much more reasonable in Clojure than in Haskell, since it's efficient.

amalloy
  • 89,153
  • 8
  • 140
  • 205
2

Simply implement

reverse = foldl (flip (:)) []

and apply it to the second list.

Daniel Wagner
  • 145,880
  • 9
  • 220
  • 380
  • 1
    I was hoping for a solution of the form `zip = foldl f x`, so that I can say it's "just" a reduce. Of course I can reverse one of the lists, but then it ends up looking like `zip xs ys = foldl f x xs $ reverse ys`, which is less satisfying. I'll add that to the question. – amalloy Jan 25 '15 at 00:14
  • 3
    @amalloy I think you're taking the claim "`foldl` is universal" to be stronger than it really is. It is common to compute "extra" data in your fold and throw some of it away or otherwise do some post-processing; e.g. the natural way to express the function which picks out every other element of a list as a fold is to produce both the even elements and the odd elements as a tuple. The claim is not "you always need exactly one `foldl` and nothing else", but rather, "the only thing you need to do to *lists* is `foldl`", which property is still true in my proposed solution. – Daniel Wagner Jan 25 '15 at 00:35
  • 2
    Also common: using the fold to produce a function, and then applying the function, which is how `foldl` and `foldr` are defined in terms of each other. – dfeuer Jan 25 '15 at 00:38
1

These two solutions are both of the form (-> (reduce f init x) something) rather than just (reduce f init x). Both carry along a container with the accumulated sequence and a some extra state, then extract the sequence from the container. In one case the container is a closure, in the other a vector.

If you want "just" (reduce f init x) then recognize that you already have all the state you need in the accumulated sequence itself -- its length.

(defn zip* [xs ys] 
  (let [xs (vec xs)
        f (fn [a y]
            (if (contains? xs (count a))
              (conj a [(xs (count a)) y])
              a))]
    (reduce f [] ys)))
Community
  • 1
  • 1
A. Webb
  • 26,227
  • 1
  • 63
  • 95