15

Okasaki describes persistent real-time queues which can be realized in Haskell using the type

data Queue a = forall x . Queue
  { front :: [a]
  , rear :: [a]
  , schedule :: [x]
  }

where incremental rotations maintain the invariant

length schedule = length front - length rear

More details

If you're familiar with the queues involved, you can skip this section.

The rotation function looks like

rotate :: [a] -> [a] -> [a] -> [a]
rotate [] (y : _) a = y : a
rotate (x : xs) (y : ys) a =
  x : rotate xs ys (y : a)

and it's called by a smart constructor

exec :: [a] -> [a] -> [x] -> Queue a
exec f r (_ : s) = Queue f r s
exec f r [] = Queue f' [] f' where
  f' = rotate f r []

after each queue operation. The smart constructor is always called when length s = length f - length r + 1, ensuring that the pattern match in rotate will succeed.

The problem

I hate partial functions! I'd love to find a way to express the structural invariant in the types. The usual dependent vector seems a likely choice:

data Nat = Z | S Nat

data Vec n a where
  Nil :: Vec 'Z a
  Cons :: a -> Vec n a -> Vec ('S n) a

and then (perhaps)

data Queue a = forall x rl sl . Queue
  { front :: Vec (sl :+ rl) a
  , rear :: Vec rl a
  , schedule :: Vec sl x
  }

The trouble is that I haven't been able to figure out how to juggle the types. It seems extremely likely that some amount of unsafeCoerce will be needed to make this efficient. However, I haven't been able to come up with an approach that's even vaguely manageable. Is it possible to do this nicely in Haskell?

dfeuer
  • 48,079
  • 5
  • 63
  • 167
  • I don't think I understand this queue type - what would the types of `exec` and `rotate` be with the vector implementation? – user2407038 Apr 14 '16 at 00:59
  • 1
    What does `forall x . { ... ; schedule :: Vec sl x }` give you? It seems to me `schedule` is essentially a natural number, because the only thing you know about it is its length, since its contents is existentially quantified. So the type of `schedule` probably should be `Sing sl`. – effectfully Apr 14 '16 at 01:00
  • 2
    @user3237465, `schedule` is indeed representing a natural number (and I introduced `x` specifically to make sure it's used *only* as a natural number), but it's actually a portion of the front list, so pattern matching on it drives incremental calculation of that list. – dfeuer Apr 14 '16 at 01:03
  • @user2407038, the type of `exec` could be expressed in various ways, depending on what works best, but one valid expression, I believe, is `Vec (sl1 :+ rl) a -> Vec rl a -> Vec ('S sl1) a -> Queue a` – dfeuer Apr 14 '16 at 01:08
  • So did you get things working? BTW, you should probably accept [the Alec's answer](http://stackoverflow.com/a/40252235/3237465) as it's clearly more informative. – effectfully Oct 26 '16 at 16:42

2 Answers2

7

Here is what I got:

open import Function
open import Data.Nat.Base
open import Data.Vec

grotate : ∀ {n m} {A : Set}
        -> (B : ℕ -> Set)
        -> (∀ {n} -> A -> B n -> B (suc n))
        -> Vec A n
        -> Vec A (suc n + m)
        -> B m
        -> B (suc n + m)
grotate B cons  []      (y ∷ ys) a = cons y a
grotate B cons (x ∷ xs) (y ∷ ys) a = grotate (B ∘ suc) cons xs ys (cons y a)

rotate : ∀ {n m} {A : Set} -> Vec A n -> Vec A (suc n + m) -> Vec A m -> Vec A (suc n + m)
rotate = grotate (Vec _) _∷_

record Queue (A : Set) : Set₁ where
  constructor queue
  field
    {X}      : Set
    {n m}    : ℕ
    front    : Vec A (n + m)
    rear     : Vec A m
    schedule : Vec X n

open import Relation.Binary.PropositionalEquality
open import Data.Nat.Properties.Simple

exec : ∀ {m n A} -> Vec A (n + m) -> Vec A (suc m) -> Vec A n -> Queue A
exec {m} {suc n} f r (_ ∷ s) = queue (subst (Vec _) (sym (+-suc n m)) f) r s
exec {m}         f r  []     = queue (with-zero f') [] f' where
 with-zero    = subst (Vec _ ∘ suc) (sym (+-right-identity m))
 without-zero = subst (Vec _ ∘ suc) (+-right-identity m)

 f' = without-zero (rotate f (with-zero r) [])

rotate is defined in terms of grotate for the same reason reverse is defined in terms of foldl (or enumerate in terms of genumerate): because Vec A (suc n + m) is not definitionally Vec A (n + suc m), while (B ∘ suc) m is definitionally B (suc m).

exec has the same implementation as you provided (modulo those substs), but I'm not sure about the types: is it OK that r must be non-empty?

Community
  • 1
  • 1
effectfully
  • 12,325
  • 2
  • 17
  • 40
  • Wowzers. That `grotate` bit is going to take some thinking, since I have very little experience with this sort of programming. Is `subst` some sort of operation for pushing equalities into expressions? The restriction on `r` could be annoying, and that was my fault for a bad comment. Sorry about that. I doubt it would be horrible, but probably annoying. – dfeuer Apr 14 '16 at 02:07
  • 1
    @dfeuer, yes, `subst` rewrites the type of a term by an equation. In your comment it's a `schedule` must be non-empty (a typo?), while in mine it's a `rear`. That's just what other types enforced me to do. BTW, [here](https://gist.github.com/gallais/7bcb261f97dd395aa9d2) is an example of how `foldl` can be translated to Haskell, you'll probably need to do a similar thing to translate `grotate`. – effectfully Apr 14 '16 at 02:20
  • I'm sorry for failing to award the bounty! I had no cell service for two days. – dfeuer Nov 02 '16 at 00:45
  • @dfeuer, no worries, the other answer deserves it better (even though it got only half of it). – effectfully Nov 02 '16 at 05:53
3

The other answer is super clever (please take a moment to upvote it), but as someone not familiar with Agda, how this would be implemented in Haskell was not obvious to me. Here's a full Haskell version. We'll need a whole slew of extensions, as well as Data.Type.Equality (since we will need to do some limited amount of type-proofs).

{-# LANGUAGE GADTs, ScopedTypeVariables,RankNTypes,
             TypeInType, TypeFamilies, TypeOperators #-}

import Data.Type.Equality

Defining Nat, Vec, and Queue

Next, we define the usual type-level natural numbers (this looks like just a regular data definition, but because we have TypeInType enabled, it will get automatically promoted when we use it in a type) and a type function (a type family) for addition. Note that although there are multiple ways of defining +, our choice here will impact what follows. We'll also define the usual Vec which is very much like a list except that it encodes its length in the phantom type n. With that, we can go ahead and define the type of our queue.

data Nat = Z | S Nat

type family n + m where
    Z   + m = m
    S n + m = S (n + m)

data Vec a n where
    Nil   :: Vec a Z
    (:::) :: a -> Vec a n -> Vec a (S n)

data Queue a where
    Queue :: { front :: Vec a (n + m)
             , rear :: Vec a m
             , schedule :: Vec x n } -> Queue a

Defining rotate

Now, things start to get hairier. We want to define a function rotate that has type rotate :: Vec a n -> Vec a (S n + m) -> Vec a m -> Vec a (S n + m), but you quickly run into a variety of proof related problems with just defining this recursively. The solution is instead to define a slightly more general grotate, which can be defined recursively, and for which rotate is a special case.

The point of Bump is to circumvent the fact that there is no such thing as type level composition in Haskell. There is no way of writing things an operator like (∘) such that (S ∘ S) x is S (S x). The workaround is to continuously wrap/unwrap with Bump/lower.

newtype Bump p n = Bump { lower :: p (S n) }

grotate :: forall p n m a.
           (forall n. a -> p n -> p (S n)) ->
           Vec a n ->
           Vec a (S n + m) ->
           p m ->
           p (S n + m)
grotate cons Nil        (y ::: _)  zs = cons y zs
grotate cons (x ::: xs) (y ::: ys) zs = lower (grotate consS xs ys (Bump (cons y zs))) 
  where
    consS :: forall n. a -> Bump p n -> Bump p (S n)
    consS = \a -> Bump . cons a . lower 

rotate :: Vec a n -> Vec a (S n + m) -> Vec a m -> Vec a (S n + m)
rotate = grotate (:::)

We need explicit foralls here to make it very clear which type variables are getting captured and which aren't, as well as to denote higher-rank types.

Singleton natural numbers SNat

Before we proceed to exec, we set up some machinery that will allow us to prove some type-level arithmetic claims (which we need to get exec to typecheck). We start by making an SNat type (which is a singleton type corresponding to Nat). SNat reflects its value in a phantom type variable.

data SNat n where
  SZero :: SNat Z
  SSucc :: SNat n -> SNat (S n)

We can then make a couple useful functions to do things with SNat.

sub1 :: SNat (S n) -> SNat n
sub1 (SSucc x) = x

size :: Vec a n -> SNat n
size Nil = SZero
size (_ ::: xs) = SSucc (size xs)

Finally, we are prepared to prove some arithmetic, namely that n + S m ~ S (n + m) and n + Z ~ n.

plusSucc :: (SNat n) -> (SNat m) -> (n + S m) :~: S (n + m)
plusSucc SZero _ = Refl
plusSucc (SSucc n) m = gcastWith (plusSucc n m) Refl

plusZero :: SNat n -> (n + Z) :~: n
plusZero SZero = Refl
plusZero (SSucc n) = gcastWith (plusZero n) Refl 

Defining exec

Now that we have rotate, we can define exec. This definition looks almost identical to the one in the question (with lists), except annotated with gcastWith <some-proof>.

exec :: Vec a (n + m) -> Vec a (S m) -> Vec a n -> Queue a
exec f r (_ ::: s) = gcastWith (plusSucc (size s) (sub1 (size r))) $ Queue f r s
exec f r Nil       = gcastWith (plusZero (sub1 (size r))) $
  let f' = rotate f r Nil in (Queue f' Nil f')

It is probably worth noting that we can get some stuff for free by using singletons. With the right extensions enabled, the following more readable code

import Data.Singletons.TH 

singletons [d|
    data Nat = Z | S Nat

    (+) :: Nat -> Nat -> Nat
    Z   + n = n
    S m + n = S (m + n)
  |]

defines, Nat, the type family :+ (equivalent to my +), and the singleton type SNat (with constructors SZ and SS equivalent to my SZero and SSucc) all in one.

Community
  • 1
  • 1
Alec
  • 31,829
  • 7
  • 67
  • 114
  • This is very helpful indeed. I worked out the Agda `foldl` translation myself, but hadn't yet attempted this one. Note that `\a -> Bump . cons a . lower = (Bump .) . (. lower) . cons = ((Bump .) . (. lower)) #. cons`, which should be a free coercion. That leaves just `plusZero` and `plusSucc` to be optimized with unsafe coercions. – dfeuer Oct 30 '16 at 00:18