1

My mental model of instance heads is they're type-level pattern matching, with binding of tyvars like binding of variables. With FlexibleInstances you can repeat tyvars in instance heads. But you can't do that at term level. Consider

class (Eq a, Eq b) => C a b  where
  foo :: a -> b -> String

instance C a a  where              -- repeated `a'
  -- the args are same type so we can compare them
  foo ...   = "equal"              -- see below
  foo _x _y = "not equal"

instance {-# OVERLAPPABLE #-} C a b  where
  foo _  _  = "not comparable"

For that "equal" case, I'd like to write

foo  x   x  = "equal"

But repeated variables aren't allowed in patterns: rejected Conflicting definitions for 'x'. And that's because each variable must get uniquely bound to the arguments to foo then we can reference them. Instead go

foo  x   y  | x == y = "equal"

So how come the repeated tyvar business works for instances? In instance C a a, which of the as is the binding site, which the referencing site? Is the compiler under the covers doing something equivalent to binding to differently named tyvars then applying a guard?

AntC
  • 2,623
  • 1
  • 13
  • 20
  • 1
    Not sure, but I think it's because it's very easy to compare types for equality, but you can't compare all values for equality. What happens if you choose `a ~ Int -> Int`, for example? Then suddenly you can't compare two values of type `a` any more. For more info on why this is disallowed for the value-level, you could look at https://stackoverflow.com/questions/1179008/pattern-matching-identical-values. – bradrn Jun 19 '19 at 13:14
  • 3
    This looks like a "pattern matching vs unification" aspect. When GHC matches against instance heads, it uses unification between the instance heads (which might contain type variables) and the constraint to be solved (which might also contain type variables). Instead, in patterns we match a value (which can NOT contain variables) against the pattern (which might contain variables), so pattern matching suffices. Also note that one happens between types, the other between values. – chi Jun 19 '19 at 13:18
  • @bradm "you can't compare all values for equality." Quite. So I've put `Eq` constraints on the class, and I only try to compare when the two types are the same -- i.e. `instance C a a`. – AntC Jun 20 '19 at 00:37
  • @chi " When GHC matches against instance heads, it uses unification between the instance heads" I don't think that's the explanation. There's no `~` constraint to unify the tyvars after matching the instance head. Note I have an `OVERLAPPABLE` instance. So GHC needs to choose between those instances _before_ it applies any unification. – AntC Jun 20 '19 at 00:43
  • 1
    @AntC I think GHC uses unification against both instance heads, and then chooses the "most specific" unifying head, or something similar. Even when there is no `~` you need unification anyway to resolve general constraints. – chi Jun 20 '19 at 12:04

1 Answers1

1

Instance selection mostly takes place in ghc/compiler/types/InstEnv.hs and it uses "matching", a restricted form of unification where a template with variables is unified with another type by substitution of those template variables.

So, when GHC decides it needs an instance dictionary for C Int Int, it attempts to match/unify C Int Int with the templates C a a and C a b, in that order (i.e., by determining through application of the function tcMatchTys from ghc/compiler/types/Unify.hs if there is a substitution for the variables a and/or b that allows one of those templates to match C Int Int).

It's an implementation detail in tcMatchTys, but it looks to me like C Int Int would be matched to the template C a a by binding a to Int based on the first a and then verifying that a is bound to Int based on the second a, so I guess the first a is the binding site and the second a is the referencing site, though it would be more accurate to say that they're both unification/matching sites. There would be no change to semantics if you changed the order.

Handling of overlapping instances takes place after this matching process. The full list of matching templates is generated and then folded by insert_overlapping, in lookupInstEnv:

-- in function `lookupInstEnv`, file `InstEnv.hs`
final_matches = foldr insert_overlapping [] all_matches

The function insert_overlapping filters out strictly more general instances wherever overlapping is allowed (and also processes incoherent instances). The final list is then matched to a singleton list, and it's an error if either zero or more than one template matches.

Note that the matching process above can involve polymorphic substitutions, so:

gack :: [t] -> String
gack xs = foo xs xs

uses the same matching process to determine that C [t] [t] matches the template C a a under the substitution a := [t]. (It also matches C a b under the substitutions a := [t], b := [t], but the dominant match wins.)

K. A. Buhr
  • 45,621
  • 3
  • 45
  • 71
  • That's a useful answer, but I worry a bit: " templates are ordered so the correct match will be the first one". You can't always do that, instances `C Int b`, `C a Bool` are in no substitution ordering. So if your Wanted is `C Int Bool`, that's rejected even if instances marked `INCOHERENT`. – AntC Jun 26 '19 at 02:26
  • I'm not sure what you mean here in the last bit -- that situation is precisely what incoherent instances are for, and it should work fine, selecting the unique coherent instance if only one is marked incoherent, or picking one arbitrarily if both are marked incoherent. – K. A. Buhr Jun 26 '19 at 13:54
  • To your other point, it looks like there are some misleading obsolete comments in the compiler source, and the list is now sorted (or more accurately "filtered") at instantiation time rather than being sorted in advance. I've updated that part of my answer. – K. A. Buhr Jun 26 '19 at 20:31