62

A discussion came up at work recently about Sets, which in Scala support the zip method and how this can lead to bugs, e.g.

scala> val words = Set("one", "two", "three")
scala> words zip (words map (_.length))
res1: Set[(java.lang.String, Int)] = Set((one,3), (two,5))

I think it's pretty clear that Sets shouldn't support a zip operation, since the elements are not ordered. However, it was suggested that the problem is that Set isn't really a functor, and shouldn't have a map method. Certainly, you can get yourself into trouble by mapping over a set. Switching to Haskell now,

data AlwaysEqual a = Wrap { unWrap :: a }

instance Eq (AlwaysEqual a) where
    _ == _ = True

instance Ord (AlwaysEqual a) where
    compare _ _ = EQ

and now in ghci

ghci> import Data.Set as Set
ghci> let nums = Set.fromList [1, 2, 3]
ghci> Set.map unWrap $ Set.map Wrap $ nums
fromList [3]
ghci> Set.map (unWrap . Wrap) nums
fromList [1, 2, 3]

So Set fails to satisfy the functor law

    fmap f . fmap g = fmap (f . g)

It can be argued that this is not a failing of the map operation on Sets, but a failing of the Eq instance that we defined, because it doesn't respect the substitution law, namely that for two instances of Eq on A and B and a mapping f : A -> B then

    if x == y (on A) then f x == f y (on B)

which doesn't hold for AlwaysEqual (e.g. consider f = unWrap).

Is the substition law a sensible law for the Eq type that we should try to respect? Certainly, other equality laws are respected by our AlwaysEqual type (symmetry, transitivity and reflexivity are trivially satisfied) so substitution is the only place that we can get into trouble.

To me, substition seems like a very desirable property for the Eq class. On the other hand, some comments on a recent Reddit discussion include

"Substitution seems stronger than necessary, and is basically quotienting the type, putting requirements on every function using the type."

-- godofpumpkins

"I also really don't want substitution/congruence since there are many legitimate uses for values which we want to equate but are somehow distinguishable."

-- sclv

"Substitution only holds for structural equality, but nothing insists Eq is structural."

-- edwardkmett

These three are all pretty well known in the Haskell community, so I'd be hesitant to go against them and insist on substitability for my Eq types!

Another argument against Set being a Functor - it is widely accepted that being a Functor allows you to transform the "elements" of a "collection" while preserving the shape. For example, this quote on the Haskell wiki (note that Traversable is a generalization of Functor)

"Where Foldable gives you the ability to go through the structure processing the elements but throwing away the shape, Traversable allows you to do that whilst preserving the shape and, e.g., putting new values in."

"Traversable is about preserving the structure exactly as-is."

and in Real World Haskell

"...[A] functor must preserve shape. The structure of a collection should not be affected by a functor; only the values that it contains should change."

Clearly, any functor instance for Set has the possibility to change the shape, by reducing the number of elements in the set.

But it seems as though Sets really should be functors (ignoring the Ord requirement for the moment - I see that as an artificial restriction imposed by our desire to work efficiently with sets, not an absolute requirement for any set. For example, sets of functions are a perfectly sensible thing to consider. In any case, Oleg has shown how to write efficient Functor and Monad instances for Set that don't require an Ord constraint). There are just too many nice uses for them (the same is true for the non-existant Monad instance).

Can anyone clear up this mess? Should Set be a Functor? If so, what does one do about the potential for breaking the Functor laws? What should the laws for Eq be, and how do they interact with the laws for Functor and the Set instance in particular?

Chris Taylor
  • 46,912
  • 15
  • 110
  • 154
  • It would be a nice point for [ML](https://groups.google.com/forum/#!forum/scala-language) rather than stackoverflow Q/A, at least a *Set zip is harmful* part. – om-nom-nom Oct 04 '13 at 09:05
  • Feel free to cross-post it there. I agree that this question may be a bit too subjective, or a bit too broad, for SO, but I thought I'd toss it out there anyway and see if it sticks. – Chris Taylor Oct 04 '13 at 09:09
  • Do have any example to show the usefulness of Set being a `Functor`. IMHO Set doesn't fit to be `Functor`. – Ankur Oct 04 '13 at 09:23
  • @Ankur Certainly. Say you are writing a library for doing certain AI calculations, which require you to "score" some collection of possible actions. You don't know if you'll get a list of actions, or a set, or a queue - you just know that you'll get a collection with some elements, so you write your `score` function to accept any `Functor`. If `Set` is a `Functor`, you can get more efficient code since actions with identical scores are discarded as you go. If you used a list, you would have to manually discard identical scores at the end (using `List.nub` for example). – Chris Taylor Oct 04 '13 at 09:38
  • @ChrisTaylor: I don't think I will generalize `get a collection with some elements` to a `Functor`. `IO`, `Maybe` and other types which are not collection are `Functor` and I am not sure if `Functor` has anything to do with collection of elements. In this case I will probably use some other generalization to represent a collection of elements. – Ankur Oct 04 '13 at 09:52
  • 7
    Maybe i'm wrong, but i don't think that collections in Scala are meant to be functors, actually there are no such things in Scala as Functors, Monads, Applicatives and other stuff in the standard library, cause prof. Odersky didn't want to have them. Maybe in Haskell it's essential for the language, but in Scala, i think, Set is just a Set and if you need Functors, Monads, etc.. use Scalaz –  Oct 04 '13 at 09:59
  • 3
    @levy Rather than "Functor" you can just read "the set of types that supports a `map` operation". That's all that `Functor` is (well, that plus some laws). – Chris Taylor Oct 04 '13 at 11:49
  • 2
    @ChrisTaylor I agree with you, but the idea behind map in Scala, and behind map in Haskell are different. In Haskell map is related to Functors, Arrows and all that mathematical stuff from CatThreory, but in Scala map is just "take this `addOne` function and apply it to each `Int`". I doubt that biggest part of Scala Devs thinks about `flatMap/bind` in terms of Monads and some abstraction over computation flow with context, it's just call function and from `List[List[A]]` make `List[A]`. –  Oct 04 '13 at 12:17
  • @ChrisTaylor Similar situation with map in Set, it doesn't matter wether it is a functor or not, i just want to add 1 to each number in the this Set, that's all –  Oct 04 '13 at 12:18
  • 2
    The `map` for `Set` is not the `Functor` map! – tibbe Oct 05 '13 at 01:06
  • 3
    I think we can say `Set` is a functor on the subcategory whose objects are types with a sane `Eq`/`Ord` instance (here "sane" includes substitutability). – Daniel Wagner Oct 05 '13 at 03:53
  • I've always been mystified by the fact that `Eq` is Haskell is not necessarily structural, so for other people like me, this other answer helped me to understand that: http://stackoverflow.com/a/12504423/293735 Once you accept that, it's relatively obvious how the useful (for algebraic reasoning and optimizations) Functor laws cannot be relied upon for `Set`s. Another `RestrictedFunctor` typeclass could be defined for such Functors that don't have such guarantees, but the Haskell Functor is a different beast. (or at least this is my rationalization) – berdario Feb 07 '15 at 12:46

3 Answers3

30

Another argument against Set being a Functor - it is widely accepted that being a Functor allows you to transform the "elements" of a "collection" while preserving the shape. [...] Clearly, any functor instance for Set has the possibility to change the shape, by reducing the number of elements in the set.

I'm afraid that this is a case of taking the "shape" analogy as a defining condition when it is not. Mathematically speaking, there is such a thing as the power set functor. From Wikipedia:

Power sets: The power set functor P : Set → Set maps each set to its power set and each function f : X → Y to the map which sends U ⊆ X to its image f(U) ⊆ Y.

The function P(f) (fmap f in the power set functor) does not preserve the size of its argument set, yet this is nonetheless a functor.

If you want an ill-considered intuitive analogy, we could say this: in a structure like a list, each element "cares" about its relationship to the other elements, and would be "offended" if a false functor were to break that relationship. But a set is the limiting case: a structure whose elements are indifferent to each other, so there is very little you can do to "offend" them; the only thing is if a false functor were to map a set that contains that element to a result that doesn't include its "voice."

(Ok, I'll shut up now...)


EDIT: I truncated the following bits when I quoted you at the top of my answer:

For example, this quote on the Haskell wiki (note that Traversable is a generalization of Functor)

"Where Foldable gives you the ability to go through the structure processing the elements but throwing away the shape, Traversable allows you to do that whilst preserving the shape and, e.g., putting new values in."

"Traversable is about preserving the structure exactly as-is."

Here's I'd remark that Traversable is a kind of specialized Functor, not a "generalization" of it. One of the key facts about any Traversable (or, actually, about Foldable, which Traversable extends) is that it requires that the elements of any structure have a linear order—you can turn any Traversable into a list of its elements (with Foldable.toList).

Another, less obvious fact about Traversable is that the following functions exist (adapted from Gibbons & Oliveira, "The Essence of the Iterator Pattern"):

-- | A "shape" is a Traversable structure with "no content," 
-- i.e., () at all locations.
type Shape t = t ()

-- | "Contents" without a shape are lists of elements.
type Contents a = [a]

shape :: Traversable t => t a -> Shape t
shape = fmap (const ())

contents :: Traversable t => t a -> Contents a
contents = Foldable.toList

-- | This function reconstructs any Traversable from its Shape and
-- Contents.  Law:
--
-- > reassemble (shape xs) (contents xs) == Just xs
--
-- See Gibbons & Oliveira for implementation.  Or do it as an exercise.
-- Hint: use the State monad...
--
reassemble :: Traversable t => Shape t -> Contents a -> Maybe (t a)

A Traversable instance for sets would violate the proposed law, because all non-empty sets would have the same Shape—the set whose Contents is [()]. From this it should be easy to prove that whenever you try to reassemble a set you would only ever get the empty set or a singleton back.

Lesson? Traversable "preserves shape" in a very specific, stronger sense than Functor does.

Luis Casillas
  • 29,802
  • 7
  • 49
  • 102
  • 1
    This is interesting, but so is `Set` a `Functor`, according to your answer? 1. The only difference between `Set` and powerset is that powerset does not use `Eq` but actual equality, which is guaranteed to be substitutive. 2. "`Functor` preserves shape" in the sense given by functor laws — it just applies the function to the elements. As observed, `Set` is only a functor assuming substitutivity of `Eq`. – Blaisorblade Aug 01 '14 at 15:22
11

Set is "just" a functor (not a Functor) from the subcategory of Hask where Eq is "nice" (i.e. the subcategory where congruence, substitution, holds). If constraint kinds were around from way back then perhaps set would be a Functor of some kind.

J. Abrahamson
  • 72,246
  • 9
  • 135
  • 180
  • Thanks - I think this is a nice way of looking at it. I tend to get too hung up on functors from Hask to Hask and forget about the rest of them... – Chris Taylor Oct 05 '13 at 08:23
  • At the very least `Contravariant` functors are nice to have in your toolbox, as they have a very nice Haskell representation. The subcategory ones are good thinking tools, though. – J. Abrahamson Oct 05 '13 at 17:11
  • IMHO, `instance SubstitutiveEq e => Functor (Set e)` would be enough, if `SubstitutiveEq` were an operationless typeclass with only laws (something I've never seen in Haskell). – Blaisorblade Aug 01 '14 at 15:24
  • 1
    @Blaisorblade surprise! As of 7.10, `MonadPlus` is an empty typeclass (well, empty save for with redundant operations that are only aliases of their equivalent generalizations) that only provides laws :) – Justin L. Jul 30 '15 at 10:42
2

Well, Set can be treated as a covariant functor, and as a contravariant functor; usually it's a covariant functor. And for it to behave regarding equality one has to make sure that whatever the implementation, it does.

Regarding Set.zip - it is nonsense. As well as Set.head (you have it in Scala). It should not exist.

Vlad Patryshev
  • 1,379
  • 1
  • 10
  • 16
  • 1
    `Set.head` is actually useful if you have a non-empty set and you want to get an arbitrary element of that set, doesn't matter which. In case of the one-element set, it's the only element of that set. Of course, the name `head` isn't the most fortunate. It should be called `Set.arbitrary` or something. – Karol S Jan 22 '14 at 14:09