12

ZipList comes with a Functor and an Applicative instance (Control.Applicative) but why not Alternative?

  • Is there no good instance?
  • What about the one proposed below?
    • Is it flawed?
    • is it useless?
    • Are there other reasonable possibilities (like Bool can be a monoid in two ways) and therefore neither should be the instance?

I searched for "instance Alternative ZipList" (with the quotes to find code first) and only found the library, some tutorials, lecture notes yet no actual instance.

Matt Fenwick said ZipList A will only be a monoid if A is (see here). Lists are monoids though, regardless of the element type.

This other answer by AndrewC to the same question discusses how an Alternative instance might look like. He says

There are two sensible choices for Zip [1,3,4] <|> Zip [10,20,30,40]:

  1. Zip [1,3,4] because it's first - consistent with Maybe
  2. Zip [10,20,30,40] because it's longest - consistent with Zip [] being discarded

where Zip is basically ZipList.

I think the answer should be Zip [1,3,4,40]. Let's see the instance:

instance Aternative Zip where
  empty = Zip []
  Zip xs <|> Zip ys = Zip (go xs ys) where
    go []     ys = ys
    go (x:xs) ys = x : go xs (drop 1 ys)

The only Zip a we can produce without knowing the type argument a is Zip [] :: Zip a, so there is little choice for empty. If the empty list is the neutral element of the monoid, we might be tempted to use list concatenation. However, go is not (++) because of the drop 1. Every time we use one entry of the first argument list, we drop one off the second as well. Thus we have a kind of overlay: The left argument list hides the beginning of the right one (or all of it).

[ 1, 3, 4,40]   [10,20,30,40]   [ 1, 3, 4]   [ 1, 3, 4]
  ^  ^  ^  ^      ^  ^  ^  ^      ^  ^  ^      ^  ^  ^
  |  |  |  |      |  |  |  |      |  |  |      |  |  |
[ 1, 3, 4] |    [10,20,30,40]   []|  |  |    [ 1, 3, 4]
[10,20,30,40]   [ 1, 3, 4]      [ 1, 3, 4]   []

One intuition behind ziplists is processes: A finite or infinite stream of results. When zipping, we combine streams, which is reflected by the Applicative instance. When the end of the list is reached, the stream doesn't produce further elements. This is where the Alternative instance comes in handy: we can name a concurrent replacement (alternative, really), taking over as soon as the default process terminates.

For example we could write fmap Just foo <|> pure Nothing to wrap every element of the ziplist foo into a Just and continue with Nothing afterwards. The resulting ziplist is infinite, reverting to a default value after all (real) values have been used up. This could of course be done by hand, by appending an infinite list inside the Zip constructor. Yet the above is more elegant and does not assume knowledge of constructors, leading to higher code reusability.

We don't need any assumption on the element type (like being a monoid itself). At the same time the definition is not trivial (as (<|>) = const would be). It makes use of the list structure by pattern matching on the first argument.

The definition of <|> given above is associative and the empty list really is the empty element. We have

Zip [] <*> xs  ==  fs <*> Zip []  ==  Zip []     -- 0*x = x*0 = 0
Zip [] <|> xs  ==  xs <|> Zip []  ==  xs         -- 0+x = x+0 = x
(fs <|> gs) <*> xs  ==  fs <*> xs <|> gs <*> xs
 fs <*> (xs <|> ys) ==  fs <*> xs <|> fs <*> ys

so all the laws you could ask for are satisfied (which is not true for list concatenation).

This instance is consistent with the one for Maybe: choice is biased to the left, yet when the left argument is unable to produce a value, the right argument takes over. The functions

zipToMaybe :: Zip a -> Maybe a
zipToMaybe (Zip [])    = Nothing
zipToMaybe (Zip (x:_)) = Just x

maybeToZip :: Maybe a -> Zip a
maybeToZip Nothing  = Zip []
maybeToZip (Just x) = Zip (repeat x)

are morphisms of alternatives (meaning psi x <|> psi y = psi (x <|> y) and psi x <*> psi y = psi (x <*> y)).

Edit: For the some/many methods I'd guess

some (Zip z) = Zip (map repeat z)
many (Zip z) = Zip (map repeat z ++ repeat [])
duplode
  • 33,731
  • 7
  • 79
  • 150
sgm
  • 143
  • 6
  • This isn't really a question as much as a proposal, correct? It'd be better to send this over the mailing list, perhaps. – J. Abrahamson Aug 13 '13 at 15:12
  • related question: http://stackoverflow.com/q/19449995/849891. – Will Ness Oct 18 '13 at 14:01
  • 1
    congrats! [it seems](https://www.reddit.com/r/haskell/comments/6vn968/compose_conference_choose_your_own_derivative/dm5tv1w/) this is now in. – Will Ness Sep 06 '17 at 19:05
  • indeed it's [in the base now](https://hackage.haskell.org/package/base-4.12.0.0/docs/Control-Applicative.html#t:ZipList) (since 4.11.0.0). btw another possible implementation is `ZL xs <|> ZL ys = ZL (map head . transpose $ [xs,ys])`. `transpose` is like "looking from above" at the lists stretched horizontally, one on top the other, just as you show. :) – Will Ness Apr 02 '20 at 08:28

4 Answers4

5

Tags / Indeces

Interesting. A not completely unrelated thought: ZipLists can be seen as ordinary lists with elements tagged by their (increasing) position index in the list. Zipping application joins two lists by pairing equally-indexed elements.

Imagine lists with elements tagged by (non-decreasing) Ord values. Zippery application would pair-up equally-tagged elements, throwing away all mismatches (it has its uses); zippery alternative could perform order-preserving left-preferring union on tag values (alternative on regular lists is also kind of a union).

This fully agrees with what you propose for indexed lists (aka ZipLists).

So yes, it makes sense.

Streams

One interpretation of a list of values is non-determinacy, which is consistent with the monad instance for lists, but ZipLists can be interpreted as synchronous streams of values which are combined in sequence.

With this stream interpretation it's you don't think in terms of the whole list, so choosing the longest stream is clearly cheating, and the correct interpretation of failing over from the first ZipList to the second in the definition <|> would be to do so on the fly as the first finishes, as you say in your instance.

Zipping two lists together doesn't do this simply because of the type signature, but it's the correct interpretation of <|>.

Longest Possible List

When you zip two lists together, the result is the minimum of the two lengths. This is because that's the longest possible list that meets the type signature without using ⊥. It's a mistake to think of this as picking the shorter of the two lengths - it's the longest possible.

Similarly <|> should generate the longest possible list, and it should prefer the left list. Clearly it should take the whole of the left list and take up the right list where the left left off to preserve synchronisation/zippiness.

AndrewC
  • 32,300
  • 7
  • 79
  • 115
Will Ness
  • 70,110
  • 9
  • 98
  • 181
  • 1
    In other words: Zippery stuff is like functions with domains (here: intervals [0 .. k] or [0 ..] as subsets of the natural numbers). When applying functions to each other, the domains have to be intersected for all participants to be defined (here: the shortest interval). The other operation is left-biased overlay and domains are combined by set union (here: the longer interval). – sgm Aug 13 '13 at 17:12
  • @sgm yes! discrete function { (f x,x) } applied on {ys} obviously applies only on ys in its domain. Alternative is not application obviously, it's more like combining the possibilities for future application, maybe. So `fs <*> (xs <|> ys)` makes sense, as well as `(fs <|> gs) <*> xs`. Very interesting. :) – Will Ness Aug 13 '13 at 17:20
  • @Will I fancy editing your answer to include all the most compelling bits out of what you and sgm have said, but it might be a rather large change, so I thought I'd ask permission first. Would that be OK? I promise to be supportive of your viewpoint. It's so I'm happy to upvote. – AndrewC Aug 26 '13 at 19:11
  • @AndrewC thanks for the edit. So we actually have two ways to interpret lists: as unordered (asynchronous?) collections (`[]`), representing non-determinacy; and as ordered (synchronous) streams (`ZipList`), with order coming from indices/"time"_tags and not from elements which are still free to *not* belong to `Ord`. "time" can also be seen as discrete ordered domains, as sgm proposes. So we either combine/pair-up only simultaneous events and discard the unpairable ones, or we join/mash events streams and keep only one of the simultaneous events. And apply on *that* a time-dependent function. – Will Ness Aug 28 '13 at 06:40
  • @AndrewC this highlights the difference between non-decreasing and increasing lists. The latter suggests the discarding behaviour; the former suggests yet another possile behaviour where on join/mash all simultaneous events are preserved (the classic merge vs. union distinction). Coming back to regular ZipLists, their tags the position indices are inherently each unique, which corresponds to increasing order. – Will Ness Aug 28 '13 at 06:44
  • @WillNess I understand now more fully with these two comments, yes. For the non-decreasing, you don't need to discard anything when mashing. – AndrewC Aug 29 '13 at 22:32
2

Your instance is OK, but it does something ZipList doesn't by
(a) aiming for the longest list, and
(b) mixing elements between source lists.

Zipping as an operation stops at the length of the shortest list.

That's why I concluded in my answer:

Thus the only sensible Alternative instance is:

instance Alternative Zip where
   empty = Zip []
   Zip [] <|> x = x
   Zip xs <|> _ = Zip xs

This is consistent with the Alternative instances for Maybe and parsers that say you should do a if it doesn't fail and go with b if it does. You can say a shorter list is less successful than a longer one, but I don't think you can say a non-empty list is a complete fail.

empty = Zip [] is chosen because it has to be polymorphic in the element type of the list, and the only such list is []

For balance, I don't think your instance is terrible, I think this is cleaner, but hey ho, roll your own as you need it!

AndrewC
  • 32,300
  • 7
  • 79
  • 115
  • Did the *vague* arguments in my answer and comments below it not convince you? :) I thought that outlook made a lot of sense. The `Maybe` Applicative instance aligns with this paradigm as well: it pairs up a (presumed total) function with a domain value (Nothing meaning its absence). Since `Maybe` is seen here as domain value supplier for total functions, there can obviously be only one of them. That's the left-biased union of possible domains then. The real distinction is, whether for (generalization of) type `:: [(a,Int)]` we can see `( _ , Int)` as part of carrier structure for `a`. (contd) – Will Ness Aug 15 '13 at 08:31
  • And `ZipList`'s answer is : yes. So we can perhaps see `ZipList` as a narrowing of more general type `:: [(a,b)]` where `b` is seen as part of structure, not value, and mismatches on `b` are possible: `newtype (Ord b => HZip b a) = HZip [(a,b)]`, `instance Applicative (HZip b) where ...`? (don't know whether it is valid Haskell). And `ZipList` is then synonym to `HZip Int`. -- I think both your "sensible choices for Zip" aren't so: if <*> is product, shouldn't <|> be sum? So left-biased union seems much more sensible. `Maybe` carries no more than 1 element, that's just causes some conflation. – Will Ness Aug 15 '13 at 09:18
  • it (`Maybe`) doesn't discard "all" elements on the right; it discards "each" extra element on the right. – Will Ness Aug 15 '13 at 09:20
  • @WillNess I smiled at the last comment. It made me chuckle. Your argument makes sense, and I don't think it's wrong. I just think this way is more consistent - when you use ZipList on lists of differing size, the extra elements in the longer list are completely unavailable to you. To make that work differently in the Applicative instance itself seems at the very least very hard. It feels wrong then somehow and not zipish to extend with elements from the other list; it exploits a type match that's not normally there. It's like if `parser1` fails after eating 4 characters `parser2` takes over. – AndrewC Aug 15 '13 at 12:57
  • (`<*>` isn’t a product operation, it’s an application operation. `<|>` is _usually_ choice, not sum.) – AndrewC Aug 15 '13 at 13:00
  • @WillNess I think choice using `<|>` should be wholesale not partial. Combining the lists feels wrong to me, but I admit I'm arguing with you because it's interesting rather than because I feel strongly. I can't think of any laws yours violates, even though I tried! I could invent one: `a <|> b`=`a`, if `a`/=`empty`, or `b` otherwise, but inventing rules to try to win the argument doesn't seem to be fair! :) – AndrewC Aug 15 '13 at 13:16
  • @WillNess By the way, I've taken a rather liberal interpretation of `a` /= `empty` in that for parsers, `a`=`empty` would mean `a` fails _on the current input_. The fact that my invented law requires this degree of woolly reasoning certainly counts against it! I guess I mean `a <|> b` should be `a`, or if `a` fails, `b`. You can then use that to support your viewpoint! – AndrewC Aug 15 '13 at 13:24
  • And I thought I was delivering a winning punchline... ;) Re products and sums, I was thinking about [this](http://stackoverflow.com/a/15211856/849891) approach, taking apart application into its two constituents, juxtaposition and actual evaluation. Re "zippish" I think both [`union` and `equalsBy`](http://ideone.com/nuoLUE) on ordlists are "zippish", that's why I liked this proposition by sgm so much. :) Re choice, `p<|>q` lets both p and q succeed, it's only when we demand *just the first successful result* that it looks like it's exclusive. And parsers are Lists, not ZipLists anyway. :) – Will Ness Aug 15 '13 at 17:16
  • I guess we'll have to agree to disagree then. :) I'm at the end of my depth here. You would have made such a valuable supporter/promoter of this idea, I had to try. :) – Will Ness Aug 15 '13 at 20:09
  • @AndrewC ziplists are no parsers. Ziplists are rather like processes and because of lazy evaluation can be actually used as such. Therefore an `Alternative` instance should not just pick the longer list, for this uses "future" information. Only checking, whether the left argument is empty, relies to much on the past for my taste: Why should the later entries of `xs <|> ys` depend on whether there is a first entry? When calculating the hundredth entry this is long forgotten. – sgm Aug 21 '13 at 12:20
  • @sgm I like your argument about synchronous processes, but I think it's not terribly consistent with how you zip lists, where you stop as soon as one stream stops. I argue your definition is only coincidently possible because of the type signature of Alternative, but you could argue that zipping carries on as long as is possible (given the type signature)! Why don't you upvote and accept Will Ness' answer, since you agree with it. I'm not feeling terribly strongly, and you're both starting to convince me. – AndrewC Aug 26 '13 at 17:01
  • Note that Will and sgm have won me round to the point of editing Will's answer to include their good arguments, and it's got my upvote! I'm leaving this here though for an alternative viewpoint that leaves you with an assertion that there's not a canonical answer. – AndrewC Aug 27 '13 at 22:40
  • Thanks for the discussion, and right you are to leave this answer here, for a counterpoint. – Will Ness Aug 30 '13 at 08:58
2

There is in fact a sensible Alternative instance for ZipList. It comes from a paper on free near-semirings (which MonadPlus and Alternative are examples of):

instance Alternative ZipList where
  empty = ZipList []

  ZipList xs <|> ZipList ys = ZipList $ go xs ys where
    go [] bs = bs
    go as [] = as
    go (a:as) (_:bs) = a:go as bs

This is a more performant version of the original code for it, which was

ZipList xs <|> ZipList ys = ZipList $ xs ++ drop (length xs) ys
Zemyla
  • 468
  • 3
  • 7
1

My guiding intuition for Alternative comes from parsers which suggest that if one branch of your alternative fails somehow it should be eradicated thus leading to a Longest-style Alternative that probably isn't terrifically useful. This would be unbiased (unlike parsers) but fail on infinite lists.

Then again, all they must do, as you suggest, is form a Monoid. Yours is left biased in a way that ZipList doesn't usually embody, though—you could clearly form the reflected version of your Alternative instance just as easily. As you point out, this is the convention with Maybe as well, but I'm not sure there's any reason for ZipList to follow that convention.

There's no sensible some or many I don't believe, although few Alternatives actually have those—perhaps they'd have been better isolated into a subclass of Alternative.

Frankly, I don't think your suggestion is a bad instance to have, but I don't have any confidence about it being "the" alternative instance implied by having a ZipList. Perhaps it'd be best to see where else this kind of "extension" instance could apply (trees?) and write it as a library.

J. Abrahamson
  • 72,246
  • 9
  • 135
  • 180
  • Not only does picking the longer list fail for infinite lists, it is also not lazy: You have to see the whole list before you can decide on the first entry of the result. – sgm Aug 13 '13 at 17:16
  • Yeah, I can't claim that's a good intuition, just one way it's been taken before. – J. Abrahamson Aug 13 '13 at 18:22