24

I was wondering to what extent Functor instances in Haskell are determined (uniquely) by the functor laws.

Since ghc can derive Functor instances for at least "run-of-the-mill" data types, it seems that they must be unique at least in a wide variety of cases.

For convenience, the Functor definition and functor laws are:

class Functor f where
  fmap :: (a -> b) -> f a -> f b

fmap id = id
fmap (g . h) = (fmap g) . (fmap h)

Questions:

  • Can one derive the definition of map starting from the assumption that it is a Functor instance for data List a = Nil | Cons a (List a)? If so, what assumptions have to be made in order to do this?

  • Are there any Haskell data types which have more than one Functor instances which satisfy the functor laws?

  • When can ghc derive a functor instance and when can't it?

  • Does all of this depend how we define equality? The Functor laws are expressed in terms of an equality of values, yet we don't require Functors to have Eq instances. So is there some choice here?

Regarding equality, there is certainly a notion of what I call "constructor equality" which allows us to reason that [a,a,a] is "equal" to [a,a,a] for any value of a of any type even if a does not have (==) defined for it. All other (useful) notions of equality are probably coarser that this equivalence relationship. But I suspect that the equality in the Functor laws are more of an "reasoning equality" relationship and can be application specific. Any thoughts on this?

recursion.ninja
  • 5,377
  • 7
  • 46
  • 78
ErikR
  • 51,541
  • 9
  • 73
  • 124
  • `Either a b` can be a functor in two ways. So can `(a, b)`... Those are both trivial examples but I think it's not out of the question that there would be some non-trivial ones. – Benjamin Hodgson Nov 04 '13 at 18:55
  • 5
    @poorsod No it couldn't, with type currying the only way to implement it is to apply `f` to the value of `Right` otherwise noop – daniel gratzer Nov 04 '13 at 18:58
  • 5
    @jozefg You're right - I guess this is a point of friction between the _Haskell typeclass_ `Functor` and the _mathematical thing_ 'functor'. – Benjamin Hodgson Nov 04 '13 at 19:24
  • See also [Haskell Functor implied law](http://stackoverflow.com/questions/8305949/haskell-functor-implied-law). – Daniel Wagner Nov 04 '13 at 20:20

2 Answers2

26

See Brent Yorgey's Typeclassopedia:

Unlike some other type classes we will encounter, a given type has at most one valid instance of Functor. This can be proven via the free theorem for the type of fmap. In fact, GHC can automatically derive Functor instances for many data types.

duplode
  • 33,731
  • 7
  • 79
  • 150
Matt Fenwick
  • 48,199
  • 22
  • 128
  • 192
2

"When can GHC derive a functor instance and when can't it?"

When we have intentional circular data structures. The type system does not allow us to express our intent of a forced circularity. So, ghc can derive an instance, similar to that which we want, but not the same.


Circular data structures are probably the only case where the Functor should be implemented a different way. But then again, it would have the same semantic.

data HalfEdge a = HalfEdge { label :: a , companion :: HalfEdge a }

instance Functor HalfEdge where
    fmap f (HalfEdge a (HalfEdge b _)) = fix $ HalfEdge (f a) . HalfEdge (f b)

EDIT:

HalfEdges are structures (known in computer graphics, 3d meshes...) that represent undirected Edges in a graph, where you can have a reference to either end. Usually they store more references to neighbour HalfEdges, Nodes and Faces.

newEdge :: a -> a -> HalfEdge a
newEdge a b = fix $ HalfEdge a . HalfEdge b

Semantically, there is no fix $ HalfEdge 0 . HalfEdge 1 . HalfEdge 2, because edges are always composed out of exactly two half edges.


EDIT 2:

In the haskell community, the quote "Tying the Knot" is known for this kind of data structure. It is about data structures that are semantically infinite because they cycle. They consume only finite memory. Example: given ones = 1:ones, we will have these semantically equivalent implementations of twos:

twos = fmap (+1) ones
twos = fix ((+1)(head ones) :)

If we traverse (first n elements of) twos and still have a reference to the begin of that list, these implementations differ in speed (evaluate 1+1 each time vs only once) and memory consumption (O(n) vs O(1)).

comonad
  • 5,134
  • 2
  • 33
  • 31
  • This is no functor. It violates `fmap id y = y` for example for the value `y = fix $ HalfEdge 0 . HalfEdge 1 . HalfEdge 2`. Of course, there is only one correct instance: `fmap f (HalfEdge a rest) = HalfEdge (f a) $ fmap f rest`. This version also includes your implementation, which is a special case that only works for a circular structure with two nodes. – Johannes Gerer Apr 06 '15 at 04:30
  • @JohannesGerer: dang! thx. yes and no. I should have clarified that this is intended to be an actual "HalfEdge" data structure, not any structure that looks like this. editing... – comonad Apr 07 '15 at 14:48
  • I know that you only intended to use it this way, still: Why not use the correct functor instance, which is identical to yours for your intended use case AND correct for all possible use cases? Or stated differently, why is it that, "Functor should be implemented a different way"? – Johannes Gerer Apr 07 '15 at 18:26
  • @JohannesGerer "correct" functor instance? I guess you meant the "derived" functor instance, which is semantically the same as my instance, indeed. The derived would be a recursive functor application which produces an infinite list of HalfEdges... infinite in memory. Instead, it should be a recursive structure, with only two items pointing at each other. If you do not see the difference, read about "Tying the Knot" (google: haskell+Tying+the+Knot). – comonad Apr 07 '15 at 22:04
  • "Correct" means satisfying the functor laws for all possible values. There is only one such instance and it is the one given in my comment above. For your use case, it is not only semantically the same as your instance it will actually produce the same result as your instance! (Try it in GHC). Concerning infinite memory: You seem to be afraid of your own monster :-) Every HalfEdge object is an infinite object, no matter whether you tie the knot after 1, 2 or n edges or never! Haskells lazyness makes this (including your use case) possible. So my `fmap` will not make things worse than they are. – Johannes Gerer Apr 08 '15 at 14:55
  • 1
    As long you only follow a finite number of edges everything is fine. PS: Your incorrect functor implementation will, of course, also yield an infinite object! PPS: If it is the runtime difference (which btw depends on the compiler implementation), then you should say so in your answer. In this case I still would suggest another data structure (that holds only two real value and provides a cyclic acces to them) and give it a correct functor instance. Conclusion: Your program is just not type safe, if your types allow unsupported behavior and thus wasting one of the biggest strengths of Haskell! – Johannes Gerer Apr 08 '15 at 15:12
  • @JohannesGerer please read about memorization/caching of result values. this has nothing to do with haskells type system or laziness (non-strictness) and it is not compiler dependent. laziness is no magic and does not support additional memorization besides the obvious one. if you have 2 objects pointing to each other, then fmap a function along its infinite traversal, you will apply that function infinite many times on the same values, evaluating to infinite **different** but **equivalent** objects. those will only be garbage collected iff you do not hold a reference to the head of that list. – comonad Apr 09 '15 at 19:36
  • besides.. my functor implementation does NOT yield an infinite object, only a finite structure that points to itself. please read "Tying the Knot", because it seems that you did not understand its concept at all. – comonad Apr 09 '15 at 19:39
  • and read about CoData (you call it infinite objects) vs Data (finite objects). in imperative languages you can have data structures that can reference themselves (directly or indirectly recursive), like nodes in graphs. in strict pure functional languages, you only have finite tree or dag data structures. in non-strict (you call it lazy) functional languages, you can have data structures recursively referencing themselves AND CoData (like infinite lists). there was a time long ago, when there were no monads... not sure, miranda or haskell1.0... there was main::String→String. – comonad Apr 09 '15 at 20:06
  • Ok, sadly it got to a point where you do not actually reply to what I say. (And you would probably say the same about me). This is said as we probably can learn from each other and should try to show the world what a great language Haskell is. I sent you an email in case you are interested in a constructive conversation? – Johannes Gerer Apr 09 '15 at 20:24
  • sounds good. i guess we think alike and have different viewpoints only because of a different basic assumption. – comonad Apr 10 '15 at 09:48