30

My program (Hartree-Fock/iterative SCF) has two matrices F and F' which are really the same matrix expressed in two different bases. I just lost three hours of debugging time because I accidentally used F' instead of F. In C++, the type-checker doesn't catch this kind of error because both variables are Eigen::Matrix<double, 2, 2> objects.

I was wondering, for the Haskell/ML/etc. people, whether if you were writing this program you would have constructed a type system where F and F' had different types? What would that look like? I'm basically trying to get an idea how I can outsource some logic errors onto the type checker.

Edit: The basis of a matrix is like the unit. You can say 1L or however many gallons, they both mean the same thing. Or, to give a vector example, you can say (0,1) in Cartesian coordinates or (1,pi/2) in polar. But even though the meaning is the same, the numerical values are different.

Edit: Maybe units was the wrong analogy. I'm not looking for some kind of record type where I can specify that the first field will be litres and the second gallons, but rather a way to say that this matrix as a whole, is defined in terms of some other matrix (the basis), where the basis could be any matrix of the same dimensions. E.g., the constructor would look something like mkMatrix [[1, 2], [3, 4]] [[5, 6], [7, 8]] and then adding that object to another matrix would type-check only if both objects had the same matrix as their second parameters. Does that make sense?

Edit: definition on Wikipedia, worked examples

Wang
  • 3,247
  • 1
  • 21
  • 33
  • 1
    what's the "bases of a matrix" - perhaps you could give a short explanation? – phynfo May 01 '11 at 18:28
  • Have you looked into the Boost.Units library for inspiration? It may give you some ideas. But in short, as long as the basis can be expressed either as a type tag or a constant integral value, you can just add it as an extra template parameter to your matrix class. Then you just have to define suitable conversion functions to convert between bases – jalf May 01 '11 at 18:37

5 Answers5

20

This is entirely possible in Haskell.

Statically checked dimensions

Haskell has arrays with statically checked dimensions, where the dimensions can be manipulated and checked statically, preventing indexing into the wrong dimension. Some examples:

This will only work on 2-D arrays:

multiplyMM :: Array DIM2 Double -> Array DIM2 Double -> Array DIM2 Double

An example from repa should give you a sense. Here, taking a diagonal requires a 2D array, returns a 1D array of the same type.

diagonal :: Array DIM2 e -> Array DIM1 e

or, from Matt sottile's repa tutorial, statically checked dimensions on a 3D matrix transform:

f :: Array DIM3 Double -> Array DIM2 Double
f u =
  let slabX = (Z:.All:.All:.(0::Int))
      slabY = (Z:.All:.All:.(1::Int))
      u' = (slice u slabX) * (slice u slabX) +
           (slice u slabY) * (slice u slabY)
  in
    R.map sqrt u'

Statically checked units

Another example from outside of matrix programming: statically checked units of dimension, making it a type error to confuse e.g. feet and meters, without doing the conversion.

 Prelude> 3 *~ foot + 1 *~ metre
 1.9144 m

or for a whole suite of SI units and quanities.

E.g. can't add things of different dimension, such as volumes and lengths:

> 1 *~ centi litre + 2 *~ inch 
Error:
Expected type: Unit DVolume a1
  Actual type: Unit DLength a0

So, following the repa-style array dimension types, I'd suggest adding a Base phantom type parameter to your array type, and using that to distinguish between bases. In Haskell, the index Dim type argument gives the rank of the array (i.e. its shape), and you could do similarly.

Or, if by base you mean some dimension on the units, using dimensional types.

So, yep, this is almost a commodity technique in Haskell now, and there's some examples of designing with types like this to help you get started.

Don Stewart
  • 137,316
  • 36
  • 365
  • 468
  • 2
    Statically checked dimensions is no different to the template he's already using in C++. – Puppy May 01 '11 at 18:47
  • 2
    The statically checked units of dimension is almost exactly what I want, except instead of meters and feet my "units" would themselves be matrices. – Wang May 01 '11 at 18:48
  • 1
    Great. An example `let v = Data.Vector.replicate 10 (1 *~ litre)` is a 1d vector of volume quanities, whose type is inferred as `Vector (Quantity DVolume Double)` – Don Stewart May 01 '11 at 19:03
  • Can you link to some examples of this? – Tom Crayford May 01 '11 at 20:46
  • Here you go. (Thanks for being patient, by the way; I'm not explaining this well.) – Wang May 01 '11 at 20:56
  • His units analogy is not exactly right; this answer doesn't encode the basis of the linear transformation. DeadMG is also right to point out he's already doing this in C++. See my response. – gatoatigrado May 01 '11 at 22:09
  • @Wang: In C++ I would use template reduction to statically reduce the matrix into a canonical form (Note: it's been done in C++0x with the `` header). I am not sufficiently advanced in Haskell to produce this, but it seems to me it should be possible using dependent types. – Matthieu M. May 02 '11 at 07:42
18

This is a very good question. I don't think you can encode the notion of a basis in most type systems, because essentially anything that the type checker does needs to be able to terminate, and making judgments about whether two real-valued vectors are equal is too difficult. You could have (2 v_1) + (2 v_2) or 2 (v_1 + v_2), for example. There are some languages which use dependent types [ wikipedia ], but these are relatively academic.

I think most of your debugging pain would be alleviated if you simply encoded the bases in which you matrix works along with the matrix. For example,

newtype Matrix = Matrix { transform :: [[Double]], 
    srcbasis :: [Double], dstbasis :: [Double] }

and then, when you M from basis a to b with N, check that N is from b to c, and return a matrix with basis a to c.

NOTE -- it seems most people here have programming instead of math background, so I'll provide short explanation here. Matrices are encodings of linear transformations between vector spaces. For example, if you're encoding a rotation by 45 degrees in R^2 (2-dimensional reals), then the standard way of encoding this in a matrix is saying that the standard basis vector e_1, written "[1, 0]", is sent to a combination of e_1 and e_2, namely [1/sqrt(2), 1/sqrt(2)]. The point is that you can encode the same rotation by saying where different vectors go, for example, you could say where you're sending [1,1] and [1,-1] instead of e_1=[1,0] and e_2=[0,1], and this would have a different matrix representation.

Edit 1

If you have a finite set of bases you are working with, you can do it...

{-# LANGUAGE EmptyDataDecls #-}
data BasisA
data BasisB
data BasisC

newtype Matrix a b = Matrix { coefficients :: [[Double]] }

multiply :: Matrix a b -> Matrix b c -> Matrix a c
multiply (Matrix a_coeff) (Matrix b_coeff) = (Matrix multiplied) :: Matrix a c
    where multiplied = undefined -- your algorithm here

Then, in ghci (the interactive Haskell interpreter),

*Matrix> let m = Matrix [[1, 2], [3, 4]] :: Matrix BasisA BasisB
*Matrix> m `multiply` m

<interactive>:1:13:
    Couldn't match expected type `BasisB'
        against inferred type `BasisA'
*Matrix> let m2 = Matrix [[1, 2], [3, 4]] :: Matrix BasisB BasisC
*Matrix> m `multiply` m2
-- works after you finish defining show and the multiplication algorithm
gatoatigrado
  • 16,580
  • 18
  • 81
  • 143
  • Ah, yes, the n-tuple example of a dependent type in your Wikipedia link is exactly what I'd be trying to achieve. Oh, it's always sad to find that the latest neat thing I've heard of won't actually solve all my problems. – Wang May 01 '11 at 22:37
  • Haha, I know what you mean. If performance is critical though, you may consider a code generator route. Essentially, one pass of your code "compiles" / generates the multiplication (does basis checking, etc.), and another pass actually does the multiplication. While I can't see how a 1-D basis check could really take that much time, it's a useful general strategy that better languages like Haskell can make manageable. – gatoatigrado May 02 '11 at 00:13
11

While I realize this does not strictly address the (clarified) question – my apologies – it seems relevant at least in relation to Don Stewart's popular answer...

I am the author of the Haskell dimensional library that Don referenced and provided examples from. I have also been writing – somewhat under the radar – an experimental rudimentary linear algebra library based on dimensional. This linear algebra library statically tracks the sizes of vectors and matrices as well as the physical dimensions ("units") of their elements on a per element basis.

This last point – tracking physical dimensions on a per element basis – is rather challenging and perhaps overkill for most uses, and one could even argue that it makes little mathematical sense to have quantities of different physical dimensions as elements in any given vector/matrix. However, some linear algebra applications of interest to me such as kalman filtering and weighted least squares estimation typically use heterogeneous state vectors and covariance matrices.

Using a Kalman filter as an example, consider a state vector x = [d, v] which has physical dimensions [L, LT^-1]. The next (future) state vector is predicted by multiplication by the state transition matrix F, i.e.: x' = F x_. Clearly for this equation to make sense F cannot be arbitrary but must have size and physical dimensions [[1, T], [T^-1, 1]]. The predict_x' function below statically ensures that this relationship holds:

predict_x' :: (Num a, MatrixVector f x x) => Mat f a -> Vec x a -> Vec x a
predict_x' f x_ = f |*< x_

(The unsightly operator |*< denotes multiplication of a matrix on the left with a vector on the right.)

More generally, for an a priori state vector x_ of arbitrary size and with elements of arbitrary physical dimensions, passing a state transition matrix f with "incompatible" size and/or physical dimensions to predict_x' will cause a compile time error.

6

In F# (which originally evolved from OCaml), you can use units of measure. Andrew Kenned, who designed the feature (and also created a very interesting theory behind it) has a great series of articles that demonstrate it.

This can quite likely be used in your scenario - although I don't fully understand the question. For example, you can declare two unit types like this:

[<Measure>] type litre
[<Measure>] type gallon

Adding litres and gallons gives you a compile time error:

1.0<litre> + 1.0<gallon> // Error!

F# doesn't automatically insert conversion between different units, but you can write a conversion function:

let toLitres gal = gal * 3.78541178<litre/gallon>
1.0<litre> + (toLitres 1.0<gallon>)

The beautiful thing about units of measure in F# is that they are automatically inferred and functions are generic. If you multiply 1.0<gallon> * 1.0<gallon>, the result is 1.0<gallon^2>.

People have used this feature for various things - ranging from conversion of virtual meters to screen pixels (in solar system simulations) to converting currencies (dollars in financial systems). Although I'm not expert, it is quite likely that you could use it in some way for your problem domain too.

Tomas Petricek
  • 240,744
  • 19
  • 378
  • 553
  • Whoa, that's amazing! Can you define arbitrary units, though? E.g., if atomic units weren't already in the library, could I define a Hartree as such-and-such many Joules? None of the examples I've found (via an admittedly cursory Google query) involve constants on the derived units. – Wang May 01 '11 at 21:03
  • Ah, so you could define the Hartree as a new unit, and manually supply a conversion function, but you could not define a derived unit that was a fraction of a Joule and expect F# to infer the conversion. Still pretty cool. Have an upvote, sir! – Wang May 01 '11 at 21:05
  • 1
    Pretty much exactly the same as the Haskell `dimensional` library described above. E.g. `(1 *~ metre) * (2 *~ foot)` == `0.6096000000000001 m^2`, which has two-dimensional type. – Don Stewart May 01 '11 at 21:08
  • @Don Does the Haskell library also infer types of functions that are polymorphic in units? For example, the type of `toLitres` above is `float<'u> -> float<'u litre/galon>` (and when you use it, units are simplified). – Tomas Petricek May 01 '11 at 23:40
  • Yep, since its just using regular Haskell type classes -- so inference works just fine. E.g. the type above for `m^2` was inferred. – Don Stewart May 01 '11 at 23:48
5

If it's expressed in a different base, you can just add a template parameter to act as the base. That will differentiate those types. A float is a float is a float- if you don't want two float values to be the same if they actually have the same value, then you need to tell the type system about it.

Puppy
  • 144,682
  • 38
  • 256
  • 465