3

I'm trying to implement a Cholesky decomposition in Halide. Part of common algorithm such as crout consists of an iteration over a triangular matrix. In a way that, the diagonal elements of the decomposition are computed by subtracting a partial column sum from the diagonal element of the input matrix. Column sum is calculated over squared elements of a triangular part of the input matrix, excluding the diagonal element.

Using BLAS the code would in C++ look as follows:

double* a; /* input matrix */
int n; /* dimension */
const int c__1 = 1;
const double c_b12 = 1.;
const double c_b10 = -1.;

for (int j = 0; j < n; ++j) {
    double ajj = a[j + j * n] - ddot(&j, &a[j + n], &n, &a[j + n], &n);
    ajj = sqrt(ajj);
    a[j + j * n] = ajj;
    if (j < n) {
            int i__2 = n - j;
            dgemv("No transpose", &i__2, &j, &c_b10, &a[j + 1 + n], &n, &a[j + n], &b, &c_b12, &a[j + 1 + j * n], &c__1);
            double d__1 = 1. / ajj;
            dscal(&i__2, &d__1, &a[j + 1 + j * n], &c__1);            
    }
}

My question is if a pattern like this is in general expressible by Halide? And if so, how would it look like?

Pablo R.
  • 31
  • 3

2 Answers2

1

I think Andrew may have a more complete answer, but in the interest of a timely response, you can use an RDom predicate (introduced via RDom::where) to enumerate triangular regions (or their generalization to more dimensions). A sketch of the pattern is:

Halide::RDom triangular(0, extent, 0, extent);
triangular.where(triangular.x < triangular.y);

Then use triangular in a reduction.

Zalman Stern
  • 3,161
  • 12
  • 18
0

I once had a fast Cholesky written in Halide. Unfortunately I can't find the code. I put the outer loop in C and wrote a good block-panel update routine that operated on something like a 32-wide panel at a time. This was before Halide had triangular iteration, so maybe you can do better now.

Andrew Adams
  • 1,396
  • 7
  • 3