22

I know this question has been asked and answered lots of times but I still don't really understand why putting constraints on a data type is a bad thing.

For example, let's take Data.Map k a. All of the useful functions involving a Map need an Ord k constraint. So there is an implicit constraint on the definition of Data.Map. Why is it better to keep it implicit instead of letting the compiler and programmers know that Data.Map needs an orderable key.

Also, specifying a final type in a type declaration is something common, and one can see it as a way of "super" constraining a data type.

For example, I can write

data User = User { name :: String }

and that's acceptable. However is that not a constrained version of

data User' s = User' { name :: s }

After all 99% of the functions I'll write for the User type don't need a String and the few which will would probably only need s to be IsString and Show.

So, why is the lax version of User considered bad:

data (IsString s, Show s, ...) => User'' { name :: s }

while both User and User' are considered good?

I'm asking this, because lots of the time, I feel I'm unnecessarily narrowing my data (or even function) definitions, just to not have to propagate constraints.

Update

As far as I understand, data type constraints only apply to the constructor and don't propagate. So my question is then, why do data type constraints not work as expected (and propagate)? It's an extension anyway, so why not have a new extension doing data properly, if it was considered useful by the community?

mb14
  • 22,276
  • 7
  • 60
  • 102
  • Sorry, I don't understand what you mean by "it doesn't propagate"? If you can't construct values that do not satisfy constraints, then your constraints propagate everywhere. – Shoe Jun 28 '14 at 10:11
  • 1
    I mean, apparently even if you add a contraint on a datatype you still have to repeat it on every single function declaration – mb14 Jun 28 '14 at 10:15
  • 1
    "even if you add a contraint on a datatype you still have to repeat it on every single function declaration" that is exactly correct. A constraint on the datatype: `data (Show a) => User a = ..` is not a proof that you have `Show a`, it is a requirement that the user must fulfill. And whenever you have a polymorphic type like `User a`, there is no way to infer that you have `Show a` unless you write it in the constraint of the function. – user2407038 Jun 28 '14 at 11:11
  • 2
    @user2407038 it could conceivably be read as a *promise* that every `User a` will be a `Show a` as well, and cause `Show a` constraint to be automatically attached to every function using `User a` (what the OP calls "propagating") - even to such functions that make no use of the `Show` interface. – Will Ness Jun 28 '14 at 11:42
  • 1
    @Will. That's exactly what I mean. You basically specify that `User a` needs a `Show a`, therefore the compiler knows that `User a` implies `Show a`. Lots of people seem to read it that way anway. – mb14 Jun 28 '14 at 12:11
  • 1
    @WillNess, @user2407038, You can have `User a` provide an implicit Show context if you use a GADT, as in my [answer](http://stackoverflow.com/a/24466703/1598537). – AndrewC Jun 28 '14 at 13:50
  • As I was going to say in my comment, this is a good question (+1), but I wonder if it doesn't better belong on Computer Science. – ouflak Jul 03 '14 at 16:34
  • "All of the useful functions involving a Map need an Ord k constraint." -- a useful function which does not need the constraint is `singleton` (and, though it's not a function, `empty` is another case where `Ord` would be unneeded). – Frerich Raabe Feb 23 '16 at 11:30

3 Answers3

16

TL;DR:
Use GADTs to provide implicit data contexts.
Don't use any kind of data constraint if you could do with Functor instances etc.
Map's too old to change to a GADT anyway. Scroll to the bottom if you want to see the User implementation with GADTs


Let's use a case study of a Bag where all we care about is how many times something is in it. (Like an unordered sequence. We nearly always need an Eq constraint to do anything useful with it.

I'll use the inefficient list implementation so as not to muddy the waters over the Data.Map issue.

GADTs - the solution to the data constraint "problem"

The easy way to do what you're after is to use a GADT:

Notice below how the Eq constraint not only forces you to use types with an Eq instance when making GADTBags, it provides that instance implicitly wherever the GADTBag constructor appears. That's why count doesn't need an Eq context, whereas countV2 does - it doesn't use the constructor:

{-# LANGUAGE GADTs #-}

data GADTBag a where
   GADTBag :: Eq a => [a] -> GADTBag a
unGADTBag (GADTBag xs) = xs

instance Show a => Show (GADTBag a) where
  showsPrec i (GADTBag xs) = showParen (i>9) (("GADTBag " ++ show xs) ++)

count :: a -> GADTBag a -> Int -- no Eq here
count a (GADTBag xs) = length.filter (==a) $ xs  -- but == here

countV2 a = length.filter (==a).unGADTBag

size :: GADTBag a -> Int
size (GADTBag xs) = length xs
ghci> count 'l' (GADTBag "Hello")
2
ghci> :t countV2
countV2 :: Eq a => a -> GADTBag a -> Int

Now we didn't need the Eq constraint when we found the total size of the bag, but it didn't clutter up our definition anyway. (We could have used size = length . unGADTBag just as well.)

Now lets make a functor:

instance Functor GADTBag where
  fmap f (GADTBag xs) = GADTBag (map f xs)

oops!

DataConstraints_so.lhs:49:30:
    Could not deduce (Eq b) arising from a use of `GADTBag'
    from the context (Eq a)

That's unfixable (with the standard Functor class) because I can't restrict the type of fmap, but need to for the new list.

Data Constraint version

Can we do as you asked? Well, yes, except that you have to keep repeating the Eq constraint wherever you use the constructor:

{-# LANGUAGE DatatypeContexts #-}

data Eq a => EqBag a = EqBag {unEqBag :: [a]}
  deriving Show

count' a (EqBag xs) = length.filter (==a) $ xs
size' (EqBag xs) = length xs   -- Note: doesn't use (==) at all

Let's go to ghci to find out some less pretty things:

ghci> :so DataConstraints
DataConstraints_so.lhs:1:19: Warning:
    -XDatatypeContexts is deprecated: It was widely considered a misfeature, 
    and has been removed from the Haskell language.
[1 of 1] Compiling Main             ( DataConstraints_so.lhs, interpreted )
Ok, modules loaded: Main.
ghci> :t count
count :: a -> GADTBag a -> Int
ghci> :t count'
count' :: Eq a => a -> EqBag a -> Int
ghci> :t size
size :: GADTBag a -> Int
ghci> :t size'
size' :: Eq a => EqBag a -> Int
ghci> 

So our EqBag count' function requires an Eq constraint, which I think is perfectly reasonable, but our size' function also requires one, which is less pretty. This is because the type of the EqBag constructor is EqBag :: Eq a => [a] -> EqBag a, and this constraint must be added every time.

We can't make a functor here either:

instance Functor EqBag where
   fmap f (EqBag xs) = EqBag (map f xs)

for exactly the same reason as with the GADTBag

Constraintless bags

data ListBag a = ListBag {unListBag :: [a]}
  deriving Show
count'' a = length . filter (==a) . unListBag
size'' = length . unListBag

instance Functor ListBag where
   fmap f (ListBag xs) = ListBag (map f xs)

Now the types of count'' and show'' are exactly as we expect, and we can use standard constructor classes like Functor:

ghci> :t count''
count'' :: Eq a => a -> ListBag a -> Int
ghci> :t size''
size'' :: ListBag a -> Int
ghci> fmap (Data.Char.ord) (ListBag "hello")
ListBag {unListBag = [104,101,108,108,111]}
ghci> 

Comparison and conclusions

The GADTs version automagically propogates the Eq constraint everywhere the constructor is used. The type checker can rely on there being an Eq instance, because you can't use the constructor for a non-Eq type.

The DatatypeContexts version forces the programmer to manually propogate the Eq constraint, which is fine by me if you want it, but is deprecated because it doesn't give you anything more than the GADT one does and was seen by many as pointless and annoying.

The unconstrained version is good because it doesn't prevent you from making Functor, Monad etc instances. The constraints are written exactly when they're needed, no more or less. Data.Map uses the unconstrained version partly because unconstrained is generally seen as most flexible, but also partly because it predates GADTs by some margin, and there needs to be a compelling reason to potentially break existing code.

What about your excellent User example?

I think that's a great example of a one-purpose data type that benefits from a constraint on the type, and I'd advise you to use a GADT to implement it.

(That said, sometimes I have a one-purpose data type and end up making it unconstrainedly polymorphic just because I love to use Functor (and Applicative), and would rather use fmap than mapBag because I feel it's clearer.)

{-# LANGUAGE GADTs #-}
import Data.String

data User s where 
   User :: (IsString s, Show s) => s -> User s

name :: User s -> s
name (User s) = s

instance Show (User s) where  -- cool, no Show context
  showsPrec i (User s) = showParen (i>9) (("User " ++ show s) ++)

instance (IsString s, Show s) => IsString (User s) where
  fromString = User . fromString

Notice since fromString does construct a value of type User a, we need the context explicitly. After all, we composed with the constructor User :: (IsString s, Show s) => s -> User s. The User constructor removes the need for an explicit context when we pattern match (destruct), becuase it already enforced the constraint when we used it as a constructor.

We didn't need the Show context in the Show instance because we used (User s) on the left hand side in a pattern match.

AndrewC
  • 32,300
  • 7
  • 79
  • 115
  • I understand that the current DataTypeContext doesn't bring anything except than hassle. I didn't know GADT "propagates" contraints. However is that not a contrieved way to use GADT which I understood where kind of made obsolote by type families ? Wouldn't that not be better to "fix" the DatatypeContexts to something usefull ? – mb14 Jun 28 '14 at 12:52
  • 1
    @mb14 It's a perfectly valid use for a GADT. GADTs are of wide application, which is one of the reasons they exist. Eg. you could also use a GADT to replace existential types by omitting the `a` parameter. I feel Type Families or using a type class at all is certainly overkill to replace a single data type, and could make things far more complicated than necessary. GADTs work here because they make things simpler. (GADTs are complementary to Type Families, not replaced by them. Certain problems can be solved by either, but it depends on the specific problem at hand. Some solutions use both.) – AndrewC Jun 28 '14 at 13:39
  • @mb14 In any case, wouldn't your fix to DatatypeContexts be exactly the GADT thing I showed here? (An implicit Eq or Ord context wherever the constructor appears.) – AndrewC Jun 28 '14 at 13:40
  • If I understand well, GADT add the constraint per constructor not for the all type, so I couldn't declare an equivalent of `Ord k => Map k a`, because there is no need of the `Ord k` on a Map constructor (`singleton` doesn't have it). `Ord k` is only needed to *combine* maps. – mb14 Jun 28 '14 at 13:57
  • Anyway I'm sure GADT does the trick, but it seems that the initial DatatypeContexts was made to be what I (and other people) would expect, but didn't and did something else ... which is useless. Therefore it's not been depricated because the concept of `DatatypeContext` is bad (as other people are still trying to convince me), but just because it's actual specification is wrong. – mb14 Jun 28 '14 at 14:09
  • @mb14 On a historical note, a data type context used to be allowable without the language extension and wasn't deprecated, so they didn't so much make an extension badly and fail to fix it, they just failed to notice the feature they had was irritating, and by the time they came to fix it, GADTs had solved the problem, so they made it off by default (turned it into a language extension) and deprecated it. – AndrewC Jun 28 '14 at 14:25
  • @mb14 I wholeheartedly agree with you that the concept isn't a problem at all. I think they deprecated it because they felt they'd made a mistake and they should have invented something along the line of GADTs instead, and they'd rather teach people about the rather more flexible GADT and have one extension than have DataTypeContexts both duplicate and restrict the functionality of GADTs. – AndrewC Jun 28 '14 at 14:25
  • @mb14 Anyway, I've popped in a GADT solution to the User problem in your question at the bottom of my answer - is it the sort of thing you had hoped for? – AndrewC Jun 28 '14 at 14:27
  • My `User` example was just a made up example. In fact, I'm using `Data.Map` and I'm annoyed that for everything single function I'm trying to write which uses `lookup` or anything else, I need to add `Ord k` in the type signature. Therefore, As I know in my current context, I only deal with strings, I endup writing `Map String a -> ...` instead of polymorphic function. The other solution would be to not write a signature at all. This seems a flaw in the language and I was wandering if I am the only one bothered by it, or is there any effort to "fix" that. – mb14 Jun 28 '14 at 14:38
  • @mb14 Ah, so in the light of my answer, your question becomes "Why isn't Data.Map now written with a GADT to remove the irritating Ord context?" Its fair to say that Data.Map has the same irritating problem they deprecated away with DatatypeContexts. – AndrewC Jun 28 '14 at 15:05
  • 2
    Have you ever tried to use something like GADTs to write something like `Data.Map`? Well, you don't get rid of the instance requirement on any function that creates a `Map` without getting one as input. That makes things like `singleton` strictly less useful, since it doesn't have an `Ord k` constraint right now. And mostly, you just make documentation lie. It's not like those functions work on types without an `Ord` instance. It's better for users of your library if your docs mention `Ord` is required in the type than pretend it isn't. – Carl Jun 28 '14 at 19:01
  • 2
    @Carl Using a GADT would switch the Ord context to only be present when supplying new keys, so yes, `singleton` (and `insert`) would have an Ord context, but using singleton on a key type that prevents you from using other map functions is useless anyway. `singleton id 7`? Also `fromList` and `fromSet` would get Ord contexts (`mapKeys` has one anyway for the new key type), but combining functions like `union`, `intersection`, `difference` would lose them. It's not lying, it's reducing the location. The docs would mention Ord as a context when you _make_ a map, not when you use it. – AndrewC Jun 28 '14 at 20:34
  • @Carl I guess what I'm saying is that for practical use, you'll need to stick to a type that has an Ord instance, so saying singleton becomes less useful isn't of great practical impace, whereas littering your code (that uses Maps) with Ord contexts is of practical impact when writing code. We can reduce typing but retain usefulness. It's not a problem. – AndrewC Jun 28 '14 at 20:40
  • It's absolutely lying. You can't use those functions with any choice of `k` that isn't in `Ord`. So why isn't that what the type signature says? – Carl Jun 28 '14 at 21:23
  • @Carl There's a big difference between implicit and deception. The do notation is implicitly passing state from one line to another in the State and Parsec monads - we declare it explicitly once and use it implicitly all the time. The failure to explicitly pass the state from one line to the next is why it's great, but it's hidden. Similarly, in a GADT, _any_ GADT with a constraint on the constructor, the constraint is implicit when it's in a pattern match and explicit otherwise. Implicit things aren't lying, they're economy of code, and only misleading if you're not used to it (cf monad fear) – AndrewC Jun 28 '14 at 22:09
  • (@Carl By the way, earlier today I tried but failed to find a question with good answers about the difference between Monad and Applicative, but was delighted to find it by accident when I clicked on your profile. I've now [linked to it](http://stackoverflow.com/questions/24467803/how-to-handle-side-effect-with-applicative?noredirect=1#comment37874708_24467994) to help out that OP more. I was hoping there would be a good explanation somewhere on SO and was pleased when I found one I already liked a lot.) – AndrewC Jun 28 '14 at 22:24
  • The definition of `fmap` for constraintless bags relies on the fact that these bags have a very flat representation. If they were even implemented as lists of pairs of items and their counts, it would become invalid. – dfeuer Jun 29 '14 at 07:12
  • @dfeuer No, for lists of counts you just need to apply f to every single value of type a you find. `map (\(a,i) -> (f a,i))` . If you're using sorting on the elements however, then you're right, because you cannot assume f preserves order. All the same the point stands that if you reduce the constraints on the data, you get more Functor and Monad instances. If you're pointing out that you might choose to sacrifice Functor for efficiency, like Set does, I agree with that. In those cases, GADTs suit nicely. – AndrewC Jun 29 '14 at 07:26
  • @AndrewC, the trouble is that if `f` is not one-to-one, then the counts won't get merged properly. – dfeuer Jun 29 '14 at 15:54
11

Constraints

The problem is that constraints are not a property of the data type, but of the algorithm/function that operates on them. Different functions might need different and unique constraints.

A Box example

As an example, let's assume we want to create a container called Box which contains only 2 values.

data Box a = Box a a

We want it to:

  • be showable
  • allow the sorting of the two elements via sort

Does it make sense to apply the constraint of both Ord and Show on the data type? No, because the data type in itself could be only shown or only sorted and therefore the constraints are related to its use, not it's definition.

instance (Show a) => Show (Box a) where
    show (Box a b) = concat ["'", show a, ", ", show b, "'"]

instance (Ord a) => Ord (Box a) where
    compare (Box a b) (Box c d) =
        let ca = compare a c
            cb = compare b d
        in if ca /= EQ then ca else cb

The Data.Map case

Data.Map's Ord constraints on the type is really needed only when we have > 1 elements in the container. Otherwise the container is usable even without an Ord key. For example, this algorithm:

transf :: Map NonOrd Int -> Map NonOrd Int
transf x = 
    if Map.null x
        then Map.singleton NonOrdA 1
        else x

Live demo

works just fine without the Ord constraint and always produce a non empty map.

Shoe
  • 74,840
  • 36
  • 166
  • 272
  • 1
    In your Box example yes, but in my Map example, a Map with a key which is not sortable is pretty useless. – mb14 Jun 28 '14 at 10:14
  • 2
    Ok but you don't really need a `Map` if you are only getting one element in it, don't you ? – mb14 Jun 28 '14 at 10:34
  • @mb14 No, you probably don't. It's just an example to prove the point that constraints are not a property of the data type. – Shoe Jun 28 '14 at 10:36
  • 4
    `Map` only really *has* to have keys that are `Eq`. Having them be `Ord` is an *efficiency* issue. – Will Ness Jun 28 '14 at 11:37
  • @AndrewC Woops. Leftover from previous testings. Thanks. – Shoe Jun 28 '14 at 23:49
1

Using DataTypeContexts reduces the number of programs you can write. If most of those illegal programs are nonsense, you might say it's worth the runtime cost associated with ghc passing in a type class dictionary that isn't used. For example, if we had

data Ord k => MapDTC k a

then @jefffrey's transf is rejected. But we should probably have transf _ = return (NonOrdA, 1) instead.

In some sense the context is documentation that says "every Map must have ordered keys". If you look at all of the functions in Data.Map you'll get a similar conclusion "every useful Map has ordered keys". While you can create maps with unordered keys using

mapKeysMonotonic :: (k1 -> k2) -> Map k1 a -> Map k2 a
singleton :: k2 a -> Map k2 a

But the moment you try to do anything useful with them, you'll wind up with No instance for Ord k2 somewhat later.

aavogt
  • 1,308
  • 6
  • 14