13

I am trying to design an API for some database system in Haskell, and I would like to model the columns of this database in a way such that interactions between columns of different tables cannot get mixed up.

More precisely, imagine that you have a type to represent a table in a database, associated to some type:

type Table a = ...

and that you can extract the columns of the table, along with the type of the column:

type Column col = ...

Finally, there are various extractors. For example, if your table contains descriptions of frogs, a function would let you extract the column containing the weight of the frog:

extractCol :: Table Frog -> Column Weight

Here is the question: I would like to distinguish the origin of the columns so that users cannot do operations between tables. For example:

bullfrogTable = undefined :: Table Frog
toadTable = undefined :: Table Frog
bullfrogWeights = extractCol bullfrogTable
toadWeights = extractCol toadTable
-- Or some other columns from the toad table
toadWeights' = extractCol toadTable
-- This should compile
addWeights toadWeights' toadWeights
-- This should trigger a type error
addWeights bullfrogWeights toadWeights

I know how to achieve this in Scala (using path-dependent types, see [1]), and I have been thinking of 3 options in Haskell:

  • not using types, and just doing a check at runtime (the current solution)

  • the TypeInType extension to add a phantom type on the Table type itself, and pass this extra type to the columns. I am not keen on it, because the construction of such a type would be very complicated (tables are generated through complex DAG operations) and probably slow to compile in this context.

  • wrapping the operations using a forall construct similar to the ST monad, but in my case, I would like the extra tagging type to actually escape the construction.

I am happy to have a very limited valid scoping for the construction of the same columns (i.e. columns from table and (id table) not being mixable), and I mostly care about the DSL feel of the API rather than the safety.

[1] What is meant by Scala's path-dependent types?

My current solution

Here is what I ended up doing, using RankNTypes.

I still want to give users the ability to use columns how they see fit without having some strong type checks, and opt in if they want some stronger type guarantees: this is a DSL for data scientist who will not know the power of the Haskell side

Tables are still tagged by their content:

type Table a = ...

and columns are now tagged with some extra reference types, on top of the type of the data they contain:

type Column ref col = ...

Projections from tables to columns are either tagged or untagged. In practice, this is hidden behind a lens-like DSL.

extractCol :: Table Frog -> Column Frog Weight

data TaggedTable ref a = TaggedTable { _ttTable :: Table a }

extractColTagged :: Table ref Frog -> Column ref Weight

withTag :: Table a -> (forall ref. TaggedTable ref a -> b) -> b
withTag tb f = f (TaggedTable tb)

Now I can write some code as following:

let doubleToadWeights = withTag toadTable $ \ttoadTable ->
  let toadWeights = extractColTagged ttoadTable in
    addWeights toadWeights toadWeights

and this will not compile, as desired:

let doubleToadWeights =
  toadTable `withTag` \ttoads ->
     bullfrogTable `withTag` \tbullfrogs ->
       let toadWeights = extractColTagged ttoads
           bullfrogWeights = extractColTagged tbullfrogs
       in addWeights toadWeights bullfrogWeights -- Type error

From a DSL perspective, I believe it is not as straightforward as what one could achieve with Scala, but the type error message is understandable, which is paramount for me.

Community
  • 1
  • 1
T. Hunter
  • 139
  • 4
  • You seem to have a pretty good lay of the land in your listed options. The only thing that comes to mind is [`reflection`](https://hackage.haskell.org/package/reflection) which you may be able to get to work for you with some cleverness. – luqui Oct 29 '16 at 11:08
  • Yes, I was looking at the `reflection` package. My issue is that the signature of `reify` is essentially closed over the type being created, i.e. it does not escape. I am not familiar enough with the Haskell type system to understand if you can create a type out of a value (which sounds like a heresy in Haskell, and only allowed through a special compiler construct in Scala). – T. Hunter Oct 29 '16 at 11:29
  • 1
    Nothing like Scala's path-dependent types are available in Haskell. The "non-`fast`" reflection variant contains a mechanism by which you can a type representation of a value, but it's messy and your error messages would be horrid. The only way to create a type with "intensional identity" (i.e. not equal to any other type) is through a scoped quantifier like `reify` or `runST`. I would begin looking for mixed approaches, e.g. go with your option #2 but create a simplified phantom type that does not capture everything, combined with selective manual labeling or something. – luqui Oct 29 '16 at 11:37
  • 1
    Of course if you use template haskell your options widen. E.g. you could generate a fresh top-level type for each table declaration and then use that to tag the columns. I think you could get a minimal approximation of path-dependent types that way, actually. – luqui Oct 29 '16 at 11:52
  • `add a phantom type on the Table type itself, and pass this extra type to the columns` Why do the tables need to have a phantom type? Perhaps it would be enough if the columns where tagged with the phantom type of its origin table. – danidiaz Oct 30 '16 at 07:42
  • @danidiaz yes this is what I ended up doing, although it requires the introduction of a separate type. You can see what I eventually came up with in the edited answer. – T. Hunter Nov 04 '16 at 09:25
  • @luqui thanks for your comments, I ended doing something as you suggested. I agree that template haskell would be nice, but it seems to have to many other downsides (compilation times, weird corner cases, lack of understandability) in this case. – T. Hunter Nov 04 '16 at 09:26

1 Answers1

1

Haskell does not (as far as I know) have path dependent types, but you can get some of the way by using rank 2 types. For instance the ST monad has a dummy type parameter s that is used to prevent leakage between invocations of runST:

runST :: (forall s . ST s a) -> a

Within an ST action you can have an STRef:

newSTRef :: a -> ST s (STRef s a) 

But the STRef you get carries the s type parameter, so it isn't allowed to escape from the runST.

Paul Johnson
  • 17,438
  • 3
  • 42
  • 59