5

I find this documentation in the basic Haskell libraries:

zip :: [a] -> [b] -> [(a, b)]
    zip takes two lists and returns a list of corresponding pairs. If one input list is short, excess elements of the longer list are discarded.

zip3 :: [a] -> [b] -> [c] -> [(a, b, c)]
    zip3 takes three lists and returns a list of triples, analogous to zip.

zip4 :: [a] -> [b] -> [c] -> [d] -> [(a, b, c, d)]
    The zip4 function takes four lists and returns a list of quadruples, analogous to zip.

[...snip...]

unzip :: [(a, b)] -> ([a], [b])
    unzip transforms a list of pairs into a list of first components and a list of second components.

unzip3 :: [(a, b, c)] -> ([a], [b], [c])
    The unzip3 function takes a list of triples and returns three lists, analogous to unzip.

unzip4 :: [(a, b, c, d)] -> ([a], [b], [c], [d])
    The unzip4 function takes a list of quadruples and returns four lists, analogous to unzip.

... and so on, up to zip7 and unzip7.

Is this a fundamental limitation of Haskell's type system? Or is there a way to implement zip and unzip once, to work on different configurations of input?

Daniel Wagner
  • 145,880
  • 9
  • 220
  • 380
Josh
  • 2,039
  • 3
  • 20
  • 25
  • 2
    The problem here is the tuples. The Haskell tuples are defined to be distinct types that have nothing in common (type-wise), and that makes a generic zipN impossible. – augustss Oct 12 '16 at 09:09
  • Related earlier question: http://stackoverflow.com/questions/2468226/how-to-zip-multiple-lists-in-haskell – danidiaz Oct 12 '16 at 11:03

4 Answers4

7

This is one very useful aspect of applicatives. Check out ZipList which is just a newtype wrapper around a simple list. The reason for the wrapper is that ZipList has an applicative instance for, you guessed it, zipping lists together. Then, if you want zip7 as bs cs ds es fs gs hs, you can just do something like

(,,,,,,) <$> as <*> bs <*> cs <*> ds <*> es <*> fs <*> gs <*> hs

As you can tell, this mechanism is meant to be also for extending zipWith, which is a general case of zip. To be honest, I think we should rip out all of the zipN functions and teach people the above instead. zip itself is fine, but beyond that...

Template Haskell solution

As the comments and other answers indicate, this is not a particularly satisfying answer. The one thing I was expecting someone else to implement was a TemplateHaskell version of zip and unzip. As no one has done so yet, here it is.

All it does is mechanically produce AST for zip or unzip functions. The idea behind zip is to use ZipList and behind unzip is to use foldr:

zip as ... zs === \as ... zs -> getZipList $ (, ... ,) <$> ZipList as <*> ... <*> ZipList zs
unzip         === foldr (\ (a, ... ,z) ~(as, ... ,zs) -> (a:as, ... ,z:zs) ) ([], ... ,[])

The implementation looks like this.

{-# LANGUAGE TemplateHaskell #-}
module Zip (zip, unzip) where

import Prelude hiding (zip, unzip)
import Language.Haskell.TH
import Control.Monad
import Control.Applicative (ZipList(..))

-- | Given number, produces the `zip` function of corresponding arity
zip :: Int -> Q Exp
zip n = do
  lists <- replicateM n (newName "xs")

  lamE (varP <$> lists)
       [| getZipList $
            $(foldl (\a b -> [| $a <*> ZipList $(varE b) |])
                    [| pure $(conE (tupleDataName n)) |]
                    lists) |]

-- | Given number, produces the `unzip` function of corresponding arity
unzip :: Int -> Q Exp
unzip n = do
  heads <- replicateM n (newName "x")
  tails <- replicateM n (newName "xs")

  [| foldr (\ $(tupP (varP <$> heads)) ~ $(tupP (varP <$> tails)) -> 
                $(tupE (zipWith (\x xs -> [| $x : $xs |])
                                (varE <$> heads)
                                (varE <$> tails))))
           $(tupE (replicate n [| [] |])) |]

You can try this at GHCi:

ghci> :set -XTemplateHaskell
ghci> $(zip 3) [1..10] "abcd" [4,6..]
[(1,'a',4),(2,'b',6),(3,'c',8),(4,'d',10)]
ghci> $(unzip 3) [(1,'a',4),(2,'b',6),(3,'c',8),(4,'d',10)]
([1,2,3,4],"abcd",[4,6,8,10])
Alec
  • 31,829
  • 7
  • 67
  • 114
  • This doesn't generalize the various `zipN` functions; it just replaces them with the corresponding hard-coded tuple constructor. (`zip` with `(,)`, `zip2` with `(,,)`, etc. – chepner Oct 12 '16 at 11:25
  • @chepner I think the OP is really asking for ways of zipping together lists once the builtin `zip1` through to `zip7` solutions fail. Given that machinery like `ZipList` exists in vanilla Haskell for this very purpose, I think it would not be a good idea to instead recommend something that needs have a dozen extensions and is bogged down by all sorts of mandatory annotations. – Alec Oct 12 '16 at 14:53
  • `ZipList` is about how you apply a list of functions to a list of inputs, not how you apply a single function of arbitrary arity to the necessary number of lists. – chepner Oct 12 '16 at 15:10
  • @chepner Again, avoiding the arbitrary arity that is difficult to navigate, I think this is as close as you get. Even the documentation describes it as a way of doing generalized `zipWith`: [`f <$> ZipList xs1 <*> ... <*> ZipList xsn = ZipList (zipWithn f xs1 ... xsn)`](http://hackage.haskell.org/package/base-4.9.0.0/docs/Control-Applicative.html#t:ZipList). That said, I realize this isn't a true arbitrary arity function. I even say that in the answer. – Alec Oct 12 '16 at 15:20
3

This is a zipN function that depends on the machinery of the generics-sop package:

{-# language TypeFamilies #-}
{-# language DataKinds #-}
{-# language TypeApplications #-}

import Control.Applicative
import Generics.SOP

-- "a" is some single-constructor product type, like some form of n-ary tuple
-- "xs" is a type-level list of the types of the elements of "a"
zipN :: (Generic a, Code a ~ '[ xs ]) => NP [] xs -> [a]
zipN np = to . SOP . Z <$> getZipList (hsequence (hliftA ZipList np))

main :: IO ()
main = do
   let zipped = zipN @(_,_,_) ([1,2,3,4,5,6] :* ['a','b','c'] :* [True,False] :* Nil)
   print $ zipped

The result:

[(1,'a',True),(2,'b',False)]

This solution has two disadvantages:

  • You have to wrap the argument lists in the special NP type from generics-sop that is constructed with :* and Nil.
  • You need to specify somehow that the result value is a list of tuples, and not a list of some other Generic-compatible type. Here, it is done with the @(_,_,_) type application.
danidiaz
  • 26,936
  • 4
  • 45
  • 95
  • You've just replace, e.g., `zip3` with `zip @(_,_,_)`. It's still not general, you've just changed where the hard-coded tuple size that the function works with to an argument. – chepner Oct 12 '16 at 11:27
  • 2
    @chepner It's not easier from the point of view of the user, but it does reduce the need of duplicated code in the implementation. Also, the user might specify the type when it uses the value, by pattern-matching for example. – danidiaz Oct 12 '16 at 11:31
  • That's fair enough. I do think this is probably the closest one can get to the OP's intention. – chepner Oct 12 '16 at 11:39
1

2-ary, 3-ary.. n-ary tuples are all distinct data types, so you can't handle them uniformly directly, but you can introduce a type class that provides an interface that allows to define generic zip and unzip. Here is how it looks for generic unzip:

class Tuple t where
  type Map (f :: * -> *) t

  nilMap   :: Proxy t -> (forall a. f a) -> Map f t
  consMap  :: (forall a. a -> f a -> f a) -> t -> Map f t -> Map f t

Map maps all types in a tuple type with f. nilMap constructs a Mapped tuple that contains empty values (I have no idea why Haskell requires that Proxy t there). consMap receives a function, a tuple and a Mapped tuple and zip the tuples with the function pointwise. Here is how instances look for 2- and 3-tuples:

instance Tuple (a, b) where
  type Map f (a, b) = (f a, f b)

  nilMap _ a = (a, a)
  consMap f (x, y) (a, b) = (f x a, f y b)

instance Tuple (a, b, c) where
  type Map f (a, b, c) = (f a, f b, f c)

  nilMap _ a = (a, a, a)
  consMap f (x, y, z) (a, b, c) = (f x a, f y b, f z c)

The gunzip itself:

gunzip :: forall t. Tuple t => [t] -> Map [] t
gunzip  []    = nilMap (Proxy :: Proxy t) []
gunzip (p:ps) = consMap (:) p (gunzip ps)

This looks a lot like transpose:

transpose :: [[a]] -> [[a]]
transpose  []      = repeat [] -- `gunzip` handles this case better
transpose (xs:xss) = zipWith (:) xs (transpose xss)

which it basically is, except with tuples. gunzip can be equivalently defined in terms of foldr as follows:

gunzip :: forall t. Tuple t => [t] -> Map [] t
gunzip = foldr (consMap (:)) $ nilMap (Proxy :: Proxy t) []

To define generic zip we need a type class of splittable data types (is there something like this on Hackage?).

class Splittable f g where
  split :: f a -> g a (f a)

E.g. for lists we have

newtype MaybeBoth a b = MaybeBoth { getMaybeBoth :: Maybe (a, b) }

instance Splittable [] MaybeBoth where
  split  []    = MaybeBoth  Nothing
  split (x:xs) = MaybeBoth (Just (x, xs))

And here is what we add to the Tuple type class:

splitMap :: (Biapplicative g, Splittable f g) => Proxy (f t) -> Map f t -> g t (Map f t)

The Biapplicative g constraint ensures that it's possible to combine g a b and g c d into g (a, c) (b, d). For 2- and 3- tuples it looks like this:

splitMap _ (a, b) = biliftA2 (,) (,) (split a) (split b)

splitMap _ (a, b, c) = biliftA3 (,,) (,,) (split a) (split b) (split c)

After providing a Biapplicative instance for MaybeBoth

instance Biapplicative MaybeBoth where
  bipure x y = MaybeBoth $ Just (x, y)
  MaybeBoth f <<*>> MaybeBoth a = MaybeBoth $ uncurry (***) <$> f <*> a

we can finally define gzip:

gzip :: forall t. Tuple t => Map [] t -> [t]
gzip a = maybe [] (\(p, a') -> p : gzip a') . getMaybeBoth $ splitMap (Proxy :: Proxy [t]) a

It repeteadly cuts first elements of lists in a tuple, forms a tuple from them and prepends it to the result.

It should be possible to generalize gunzip by adding a dual to Splittable (Uniteable or something like that), but I'll stop here.

EDIT: I couldn't stop.

effectfully
  • 12,325
  • 2
  • 17
  • 40
0

You are right that these functions (zip2, zip3 etc.) are all instances of the same pattern and in an ideal world, they should be implementable generically. By the way, as an exercise to the reader, figure out what zip1 and zip0 should be ;).

However, it is hard to implement zipN generically, because the common pattern between all the different cases is rather non-trivial. This does not mean it's impossible to implement it generically, but you'll need some of the more advanced type system features of Haskell GHC to do it.

To be more concrete, zip2, zip3 etc. all have a different number of arguments, making this an instance of "arity-generic programming" (the arity of a function is its number of arguments). As you might expect in the world of functional programming, there is an interesting research paper that covers precisely this topic ("arity-generic programming"), and conveniently, one of their main examples is... zipWithN. It doesn't directly answer your question because it uses Agda rather than Haskell, but you might still find it interesting. In any case, similar ideas can be implemented in terms of one or more of Haskell's GHC's more advanced type-system features (TypeFamilies and DataKinds come to mind). PDF version here.

By the way, this is just about an arity-generic zipWithN. For an arity-generic zipN, you probably need some support from the compiler, particularly an arity-generic interface to the tuple constructor, which I suspect might not be in GHC. This is what I believe augustss's comment to the question and chepner's comment to Alec's answer refer to.

Dominique Devriese
  • 2,998
  • 1
  • 15
  • 21
  • Doing `zipN` is considerably tougher than `zipWithN`, since (and I think this is the gist of augustss' comment) tuples types are independent, unlike function types which are defined recursively. – Alec Oct 12 '16 at 17:53
  • Yes, that's what my final paragraph is about. It seems to me that the compiler could provide some help there, for example by exposing a primitive type-level function `mkTupleType` of kind `(n : Nat) -> (tys : Vec * n) -> NFun tys *`. – Dominique Devriese Oct 12 '16 at 18:46
  • Sorry, to correct myself, what I mean is the combination of a type-level primitive like `mkTupleType :: (n : Nat) -> Vec * n -> *` and a value-level primitive like `mkTuple :: forall (n : Nat) (tys : Vec * n). NFun tys (mkTuple n tys)`. Here, NFun is a hypothetical type-level function that constructs a function type from a vector of argument types and a result type (which probably exists in some form in a few generics libraries). – Dominique Devriese Oct 13 '16 at 18:40
  • "zip2, zip3 etc. all have a different number of arguments, making this an instance of arity-generic programming..." not necessarily. The function could just as easily be written to take a single argument, which is a list of lists, as the "unzip" functions do this with their return value. – Josh Oct 17 '16 at 04:24
  • You can do that, of course, but then you're giving up some type safety. As I understand this question, it is about whether you can do zipn without giving up type safety. – Dominique Devriese Oct 18 '16 at 19:18