2

Is there any strategy for creating a bijection between two datatypes? For example, consider those datatypes:

data Colbit 
    = White Colbit Colbit 
    | Black Colbit Colbit 
    | Tip

data Bits
    = B0 Bits
    | B1 Bits
    | BEnd

Plus the constraint that a valid element of Colbit must have an odd number of nodes (White/Black constructors). How can I create a map:

toColbit :: Bits -> Colbit
fromColbit :: Colbit -> Bits

Such that for all b : Bits, fromColbit (toColbit b) == b, and for all c : Colbit, toColbit (fromColbit c) == c? (Also, what is this property called?)

MaiaVictor
  • 51,090
  • 44
  • 144
  • 286
  • 2
    I think there is no general strategy how to create a bijection between the two types because you have to define the bijection which is based on the logic of the two types. Can you explain what you are trying to do? – Martin Mar 25 '16 at 22:05
  • I'm trying to map elements of the datatype `Colbit` to binary strings, so I can perform a brute-force search on them looking for an element that satisfies some specific properties. I also want to be able to store an element on disk with the minimal amount of bits possible. (If I just serialize it naively, there will be some bitstrings that don't correspond to an element of Colbit, wasting space.) – MaiaVictor Mar 25 '16 at 22:09
  • And why don't you just map them? I didn't get the point. – Martin Mar 25 '16 at 22:13
  • I don't know how to map elements of Colbit to binary strings in a way that every binary string corresponds to an element of Colbit and vice versa. That's what I'm asking. – MaiaVictor Mar 25 '16 at 22:20
  • Sorry, my fault. Got it from the answer. – Martin Mar 25 '16 at 22:23
  • 1
    This paper is the correct answer to the question I think: http://research.microsoft.com/en-us/people/dimitris/every-bit-counts.pdf (anyone should feel free to summarize the key bits to make this a real answer) – sclv Mar 25 '16 at 22:27
  • (and the code from the paper is on hackage too: https://hackage.haskell.org/package/every-bit-counts) – sclv Mar 25 '16 at 22:28
  • That paper answers every question ever? I'll think how it applies to this. – MaiaVictor Mar 25 '16 at 22:31
  • I mean to say, that paper gives the general theory of encodings such as this. :-) – sclv Mar 26 '16 at 00:09
  • @sclv the problem is finding the right game. I have no idea how to approach this one. The constraint makes it even harder. – MaiaVictor Mar 26 '16 at 03:07
  • Gotcha! I'll give it a think. My first move would be to create a datatype that encodes the constraint fully, and from there i'd take the datatype to the bit encoding as a second step. – sclv Mar 26 '16 at 03:08
  • How could the datatype encode the constraint without dependent type? And the paper doesn't cover dependent types, so... – MaiaVictor Mar 26 '16 at 03:44

1 Answers1

10

Step 1 is to convert the odd-ness constraint of Colbit to the type level:

{-# LANGUAGE TypeSynonymInstances #-}

data Color = Black | White deriving (Bounded, Enum, Eq, Ord, Read, Show)
data Odd = Evens Color Even Even | Odds Color Odd Odd deriving (Eq, Ord, Read, Show)
data Even = Tip | OddL Color Odd Even | OddR Color Even Odd deriving (Eq, Ord, Read, Show)
type Colbit = Odd

Then you can play the tricks I described in my previous answer to one of your questions to build a bijection with the naturals. Recalling the preamble:

type Nat = Integer
class Godel a where
    to :: a -> Nat
    from :: Nat -> a

instance Godel Nat where to = id; from = id

-- you should probably fix this instance to not use
-- Double if you plan to use it for anything serious
instance (Godel a, Godel b) => Godel (a, b) where
    to (m_, n_) = (m + n) * (m + n + 1) `quot` 2 + m where
        m = to m_
        n = to n_
    from p = (from m, from n) where
        isqrt    = floor . sqrt . fromIntegral
        base     = (isqrt (1 + 8 * p) - 1) `quot` 2
        triangle = base * (base + 1) `quot` 2
        m = p - triangle
        n = base - m

instance (Godel a, Godel b) => Godel (Either a b) where
    to (Left  l) = 0 + 2 * to l
    to (Right r) = 1 + 2 * to r
    from n = case n `quotRem` 2 of
        (l, 0) -> Left  (from l)
        (r, 1) -> Right (from r)

With that in place, the instances for our types are pretty easy.

monomorph :: Either a a -> Either a a
monomorph = id

toColored :: Godel v => (Color, v) -> Nat
toColored (Black, v) = to (monomorph (Left  v))
toColored (White, v) = to (monomorph (Right v))

fromColored :: Godel v => Nat -> (Color, v)
fromColored n = case from n of
    Left  v -> (Black, v)
    Right v -> (White, v)

instance Godel Odd where
    to (Evens c l r) = 0 + 2 * toColored (c, (l, r))
    to (Odds  c l r) = 1 + 2 * toColored (c, (l, r))
    from n = case n `quotRem` 2 of
        (clr, 0) -> Evens c l r where (c, (l, r)) = fromColored clr
        (clr, 1) -> Odds  c l r where (c, (l, r)) = fromColored clr

instance Godel Even where
    to Tip = 0
    to (OddL c l r) = 1 + 2 * toColored (c, (l, r))
    to (OddR c l r) = 2 + 2 * toColored (c, (l, r))
    from 0 = Tip
    from n = case (n-1) `quotRem` 2 of
        (clr, 0) -> OddL c l r where (c, (l, r)) = fromColored clr
        (clr, 1) -> OddR c l r where (c, (l, r)) = fromColored clr

And that's pretty much it. Now you've got your bijection with the naturals, and you can pick your favorite bijection between the naturals and bitstreams to postcompose with.

Community
  • 1
  • 1
Daniel Wagner
  • 145,880
  • 9
  • 220
  • 380
  • @dfeuer Your proposed instance is not onto (e.g. `5` is not the image of any pair). – Daniel Wagner Mar 26 '16 at 22:58
  • I just realized it should be really easy to encode using little-endian naturals designed to ensure there are no extra zeros. `newtype Natural = Nat 'True` `data Nat mayzero where Zero :: Nat 'False -> Nat may; One :: Nat may -> Nat be; End :: Nat 'True`. Pairs can be formed by interleaving. – dfeuer Mar 28 '16 at 19:09
  • @dfeuer Interleaving is trickier than you may expect. You need to know when the shorter sequence ends. See also [the link from my previous answer](http://www.sciencedirect.com/science/article/pii/002200009290027G) for a discussion of the subtle art of pairing functions, and for the definition of a particularly clean and streamable pairing function. – Daniel Wagner Mar 28 '16 at 19:13
  • That's an interesting paper. Thanks. My aim with the interleaving approach wasn't *efficiency*, as theirs is, but rather *correctness and simplicity*, which no approach using integer square roots is likely to attain simultaneously. – dfeuer Mar 29 '16 at 01:59
  • @dfeuer Your goals are admirable, but your solution as I understand it is not correct. Their goals may differ from yours, but their solution meets your criteria of being both clearly correct and very simple. (I avoid it only because its implementation is less convenient on `Integer`s than it is on lists of bits -- which means either an inconveniently long implementation or the introduction of glue code. Either of those would distract from the main idea.) – Daniel Wagner Mar 29 '16 at 04:01
  • Grrr. I see now why my interleaving approach fails. I think there must be some way to make one work, but I guess it will require much more thought and care. – dfeuer Mar 29 '16 at 04:08