Template function overloading and SFINAE implementations

Question

I'm spending some time in learning how to use templates in C++. I never used them before and I'm not always sure what can be or what cannot be achieved in different situation.

As an exercise I'm wrapping some of the Blas and Lapack functions that I use for my activities, and I'm currently working on the wrapping of ?GELS (that evaluates the solution of a linear set of equations).

 A x + b = 0

?GELS function (for real values only) exists with two names: SGELS, for single precision vectors and DGELS for double precision.

My idea of interface is a function solve in this way:

 const std::size_t rows = /* number of rows for A */;
 const std::size_t cols = /* number of cols for A */;
 std::array< double, rows * cols > A = { /* values */ };
 std::array< double, ??? > b = { /* values */ };  // ??? it can be either
                                                  // rows or cols. It depends on user
                                                  // problem, in general
                                                  // max( dim(x), dim(b) ) =
                                                  // max( cols, rows )     
 solve< double, rows, cols >(A, b);
 // the solution x is stored in b, thus b 
 // must be "large" enough to accommodate x

Depending on user requirements, the problem may be overdetermined or undetermined, that means:

if it is overdetermined dim(b) > dim(x) (the solution is a pseudo-inverse)
if it is undetermined dim(b) < dim(x) (the solution is a LSQ minimization)
or the normal case in which dim(b) = dim(x) (the solution is the inverse of A)

(without considering singular cases).

Since ?GELS stores the result in the input vector b, the std::array should have enough space to accommodate the solution, as described in code comments (max(rows, cols)).

I want to (compile time) determine which kind of solution to adopt (it is a parameter change in ?GELS call). I have two functions (I'm simplifying for the sake of the question), that handle the precision and already know which is the dimension of b and the number of rows/cols:

namespace wrap {

template <std::size_t rows, std::size_t cols, std::size_t dimb>
void solve(std::array<float, rows * cols> & A, std::array<float, dimb> & b) {
  SGELS(/* Called in the right way */);
}

template <std::size_t rows, std::size_t cols, std::size_t dimb>
void solve(std::array<double, rows * cols> & A, std::array<double, dimb> & b) {
  DGELS(/* Called in the right way */);
}

}; /* namespace wrap */

that are part of an internal wrapper. The user function, determine the size required in the b vector through templates:

#include <type_traits>

/** This struct makes the max between rows and cols */
template < std::size_t rows, std::size_t cols >
struct biggest_dim {
  static std::size_t const value = std::conditional< rows >= cols, std::integral_constant< std::size_t, rows >,
                                                     std::integral_constant< std::size_t, cols > >::type::value;
};

/** A type for the b array is selected using "biggest_dim" */
template < typename REAL_T, std::size_t rows, std::size_t cols >
using b_array_t = std::array< REAL_T, biggest_dim< rows, cols >::value >;

/** Here we have the function that allows only the call with b of
 *  the correct size to continue */
template < typename REAL_T, std::size_t rows, std::size_t cols >
void solve(std::array< REAL_T, cols * rows > & A, b_array_t< REAL_T, cols, rows > & b) {
  static_assert(std::is_floating_point< REAL_T >::value, "Only float/double accepted");
  wrap::solve< rows, cols, biggest_dim< rows, cols >::value >(A, b);
}

In this way it actually works. But I want to go one step further, and I really don't have a clue on how to do it. If the user tries to call solve with b of a size that is too small an extremely difficult-to-read error is raised by the compiler.

I'm trying to insert a static_assert that helps the user to understand his error. But any direction that comes in my mind requires the use of two function with the same signature (it is like a template overloading?) for which I cannot find a SFINAE strategy (and they actually do not compile at all).

Do you think it is possible to raise a static assertion for the case of wrong b dimension without changing the user interface at compile time? I hope the question is clear enough.

@Caninonos: For me the user interface is how the user calls the solver, that is:

 solve< type, number of rows, number of cols > (matrix A, vector b)

This is a constraint that I put on my exercise, in order to improve my skills. That means, I don't know if it is actually possible to achieve the solution. The type of b must match the function call, and it is easy if I add another template parameter and I change the user interface, violating my constraint.

Minimal complete and working example

This is a minimal complete and working example. As requested I removed any reference to linear algebra concepts. It is a problem of number. The cases are:

N1 = 2, N2 =2. Since N3 = max(N1, N2) = 2 everything works
N1 = 2, N2 =1. Since N3 = max(N1, N2) = N1 = 2 everything works
N1 = 1, N2 =2. Since N3 = max(N1, N2) = N2 = 2 everything works
N1 = 1, N2 =2. Since N3 = N1 = 1 < N2 it correctly raises a compilation error. I want to intercept the compilation error with a static assertion that explains the fact that the dimension of N3 is wrong. As for now the error is difficult to read and understand.

You can view and test it online here

I'm not really sure I understand what you mean by "changing the user interface at compile time", but maybe you simply want a `static_assert(dimb == biggest_dim< rows, cols >::value, "msg")` in your first versions of `solve` ? — Caninonos, Mar 08 '18 at 11:37
@Caninonos at first I thought it cannot works, but instead is a good idea.I have only to change my way of looking at the problem... — Matteo Ragni, Mar 08 '18 at 11:42
@Caninonos actually I'm not sure how to implement it though... I'll better explain what I mean with not changing the user interface — Matteo Ragni, Mar 08 '18 at 11:44
I don't quite understand `max( dim(x), dim(b) ) == max( cols, rows )`? Why not simply require that `b` is, say, in column format just use `dim(b) == rows`? — WorldSEnder, Mar 08 '18 at 12:05
@WorldSEnder there is an equality `=`, not a comparison. The size of the vector `x` is the number of columns of `A`, while the size of the vector `b` is the number of rows of `A`. But `?GELS` stores the solution of the linear problem (the values of `x`) in the memory "allocated" for `b`. That means that the size of `b` must be the `max(rows, cols)`. Youcan check on the reference: http://www.netlib.org/lapack/explore-html/d7/d3b/group__double_g_esolve_ga225c8efde208eaf246882df48e590eac.html#ga225c8efde208eaf246882df48e590eac — Matteo Ragni, Mar 08 '18 at 12:12
Can you reduce this problem to its minimal template issues? Right now it seems mired down by all those linear algebra — Passer By, Mar 08 '18 at 12:19
C++ language version? It makes a difference in this case. C++11 C++14 or C++17? higher is better :) — bolov, Mar 08 '18 at 12:54

score 2 · Accepted Answer · edited Oct 31 '21 at 20:10

2

First some improvements that simplify the design a bit and help readability:

there is no need for biggest_dim. std::max is constexpr since C++14. You should use it instead.
there is no need for b_array_t. You can just write std::array< REAL_T, std::max(N1, N2)>

And now to your problem. One nice way in C++17 is:

template < typename REAL_T, std::size_t N1, std::size_t N2, std::size_t N3>
void solve(std::array< REAL_T, N1 * N2 > & A, std::array< REAL_T, N3> & b) {

    if constexpr (N3 == std::max(N1, N2))
        wrap::internal< N1, N2, N3 >(A, b);
    else
        static_assert(N3 == std::max(N1, N2), "invalid 3rd dimension");

        // don't write static_assert(false)
        // this would make the program ill-formed (*)
}

Or, as pointed by @max66

template < typename REAL_T, std::size_t N1, std::size_t N2, std::size_t N3>
void solve(std::array< REAL_T, N1 * N2 > & A, std::array< REAL_T, N3> & b) {

    static_assert(N3 == std::max(N1, N2), "invalid 3rd dimension");

    if constexpr (N3 == std::max(N1, N2))
        wrap::internal< N1, N2, N3 >(A, b);

}

Tadaa!! Simple, elegant, nice error message.

The difference between the constexpr if version and just a static_assert I.e.:

void solve(...)
{
   static_assert(...);
   wrap::internal(...);
}

is that with just the static_assert the compiler will try to instantiate wrap::internal even on static_assert fail, polluting the error output. With the constexpr if the call to wrap::internal is not part of the body on condition fail so the error output is clean.

(*) The reason I didn't just write static_asert(false, "error msg) is because that would make the program ill-formed, no diagnostics required. See constexpr if and static_assert

You can also make the float / double deductible if you want by moving the template argument after the non-deductible ones:

template < std::size_t N1, std::size_t N2, std::size_t N3,  typename REAL_T>
void solve(std::array< REAL_T, N1 * N2 > & A, std::array< REAL_T, N3> & b) {

So the call becomes:

solve< n1_3, n2_3>(A_3, b_3);

edited Oct 31 '21 at 20:10

marc_s

732,580
175
1,330
1,459

answered Mar 08 '18 at 13:09

bolov

72,283
15
145
224

or simpler: `wrap::internal< N1, N2, N3 >(A, b);` – max66 Mar 08 '18 at 13:11
@max66 yes, indeed. Thank you. – bolov Mar 08 '18 at 13:15
To avoid the `static_assert()` UB risk problem (and to make the solution C++14 compliant) you can avoid the `if` itself: the body of the function can be simply `{ static_assert(N3 == std::max(N1, N2), "invalid 3rd dimension"); wrap::internal< N1, N2, N3 >(A, b); }` – max66 Mar 08 '18 at 13:17
yes... is a good reason... but you can maintain `if constexpr` only for `wrap::internal` and free the `static_assert()` from it: `{ static_assert(N3 == std::max(N1, N2), "invalid 3rd dimension"); if constexpr (N3 == std::max(N1, N3)) wrap::internal< N1, N2, N3 >(A, b); }` – max66 Mar 08 '18 at 13:23
Extremely nice answer. But you changed the template of the solve function, thus the user interface has changed, isn't it? (I'm not fast reasoning with template... probably I'm wrong... ) – Matteo Ragni Mar 08 '18 at 13:23
@MatteoRagni the call is identical – bolov Mar 08 '18 at 13:23
Why would `static_assert(false, "x")` be UB? (Yes, I do want to know.) – You Mar 08 '18 at 13:24
@bolov thus if I call `solve(A, b)` it resolves automatically `N3`? – Matteo Ragni Mar 08 '18 at 13:24
@MatteoRagni yes, because the second argument is a deducible context. – bolov Mar 08 '18 at 13:25
One last question: what makes it C++17? – Matteo Ragni Mar 08 '18 at 13:27
the `if constexpr` is a C++17 feature – bolov Mar 08 '18 at 13:27
regarding the `static_assert(falsle)`. It's a complicated and convoluted issue, but I finally understood it is UB for a class templates as used here: https://stackoverflow.com/questions/30078818/static-assert-dependent-on-non-type-template-parameter-different-behavior-on-gc I think for function templates is OK, but I can't say 100% and the `if constexpr` complicates the matter further. So I just steered to the safe side – bolov Mar 08 '18 at 13:30
I see thank you. I'll review the answers (that are incredible) and I'll decide – Matteo Ragni Mar 08 '18 at 13:30
@bolov: Nothing in that accepted answer implies that `static_assert(false)` would be undefined behavior in the general case. What it's saying is that the template makes the program ill-formed (no diagnostic required) because there is no way to instantiate the template without making the program ill-formed; this is has nothing to do with `static_assert` as such. It certainly wouldn't be different for an expression that is provably false, which is what you already have in the question you linked. – You Mar 08 '18 at 13:45
@You https://stackoverflow.com/questions/49177408/static-assertfalse-in-a-function-main-template – bolov Mar 08 '18 at 16:01
it's official `static_assert(false)` would make the program ill-formed – bolov Mar 08 '18 at 16:06

Arda Aytekin · Answer 2 · 2018-03-08T13:57:06.793

Why don't you try to combine tag dispatch together with some static_asserts? Below is one way of achieving what you want to solve, I hope. I mean, all the three correct cases are properly piped to the correct blas calls, different types and dimension mismatches are handled, and the violation about float and doubles is also handled, all in a user-friendly way, thanks to static_assert.

EDIT. I am not sure about your C++ version requirement, but below is C++11 friendly.

#include <algorithm>
#include <iostream>
#include <type_traits>

template <class value_t, int nrows, int ncols> struct Matrix {};
template <class value_t, int rows> struct Vector {};

template <class value_t> struct blas;

template <> struct blas<float> {
  static void overdet(...) { std::cout << __PRETTY_FUNCTION__ << std::endl; }
  static void underdet(...) { std::cout << __PRETTY_FUNCTION__ << std::endl; }
  static void normal(...) { std::cout << __PRETTY_FUNCTION__ << std::endl; }
};

template <> struct blas<double> {
  static void overdet(...) { std::cout << __PRETTY_FUNCTION__ << std::endl; }
  static void underdet(...) { std::cout << __PRETTY_FUNCTION__ << std::endl; }
  static void normal(...) { std::cout << __PRETTY_FUNCTION__ << std::endl; }
};

class overdet {};
class underdet {};
class normal {};

template <class T1, class T2, int nrows, int ncols, int dim>
void solve(const Matrix<T1, nrows, ncols> &lhs, Vector<T2, dim> &rhs) {
  static_assert(std::is_same<T1, T2>::value,
                "lhs and rhs must have the same value types");
  static_assert(dim >= nrows && dim >= ncols,
                "rhs does not have enough space");
  static_assert(std::is_same<T1, float>::value ||
                std::is_same<T1, double>::value,
                "Only float or double are accepted");
  solve_impl(lhs, rhs,
             typename std::conditional<(nrows < ncols), underdet,
             typename std::conditional<(nrows > ncols), overdet,
                                                        normal>::type>::type{});
}

template <class value_t, int nrows, int ncols, int dim>
void solve_impl(const Matrix<value_t, nrows, ncols> &lhs,
                Vector<value_t, dim> &rhs, underdet) {
  /* get the pointers and dimension information from lhs and rhs */
  blas<value_t>::underdet(
      /* trans, m, n, nrhs, A, lda, B, ldb, work, lwork, info */);
}

template <class value_t, int nrows, int ncols, int dim>
void solve_impl(const Matrix<value_t, nrows, ncols> &lhs,
                Vector<value_t, dim> &rhs, overdet) {
  /* get the pointers and dimension information from lhs and rhs */
  blas<value_t>::overdet(
      /* trans, m, n, nrhs, A, lda, B, ldb, work, lwork, info */);
}

template <class value_t, int nrows, int ncols, int dim>
void solve_impl(const Matrix<value_t, nrows, ncols> &lhs,
                Vector<value_t, dim> &rhs, normal) {
  /* get the pointers and dimension information from lhs and rhs */
  blas<value_t>::normal(
      /* trans, m, n, nrhs, A, lda, B, ldb, work, lwork, info */);
}

int main() {
  /* valid types */
  Matrix<float, 2, 4> A1;
  Matrix<float, 4, 4> A2;
  Matrix<float, 5, 4> A3;
  Vector<float, 4> b1;
  Vector<float, 5> b2;
  solve(A1, b1);
  solve(A2, b1);
  solve(A3, b2);

  Matrix<int, 4, 4> A4;
  Vector<int, 4> b3;
  // solve(A4, b3); // static_assert for float & double

  Matrix<float, 4, 4> A5;
  Vector<int, 4> b4;
  // solve(A5, b4); // static_assert for different types

  // solve(A3, b1); // static_assert for dimension problem

  return 0;
}

This is a cool answer, and extremely sophisticate (for me at least). It is out of my league (for now), and I willneed a couple of hour to understand it completely. But unfortunately, the @bolov answer is the one that is simplest to adapt to my current codebase... I upvoted your answer though, I'm sorry I cannot reward you with more... — Matteo Ragni, Mar 08 '18 at 13:47
Thank you, @MatteoRagni, for the upvote :) The above code excerpt, except for the `Matrix` and `Vector` types, is what I use for my project. I needed to wrap `blas` and `lapack` calls. The code, after the `static_assertion`s, relies on zero-sized types to dispatch to the corresponding function calls. The types at the end of `solve_impl` just determine the correct function call at compile time, and the zero-sized objects are optimized out. Of course, you should not pipe `lhs` and `rhs` directly to `blas::*` functions, as you would not want these functions to be compiled for each of the types — Arda Aytekin, Mar 08 '18 at 13:53

WorldSEnder · Answer 3 · 2018-03-08T18:16:54.933

You have to consider why the interface offers this (convoluted) mess of parameters. The author had several things in mind. First of all, you can solve problems of the form A x + b == 0and A^T x + b == 0 in one function. Secondly, the given A and b can actually point to memory in matrices larger than the ones needed by alg. This can be seen by the LDA and LDB parameters.

It is the subaddressing that makes things complicated. If you want a simple but maybe useful enough API, you could chose to ignore that part:

using ::std::size_t;
using ::std::array;

template<typename T, size_t rows, size_t cols>
using matrix = array<T, rows * cols>;

enum class TransposeMode : bool {
  None = false, Transposed = true
};

// See https://stackoverflow.com/questions/14637356/
template<typename T> struct always_false_t : std::false_type {};
template<typename T> constexpr bool always_false_v = always_false_t<T>::value;

template < typename T, size_t rowsA, size_t colsA, size_t rowsB, size_t colsB
    , TransposeMode mode = TransposeMode::None >
void solve(matrix<T, rowsA, colsA>& A, matrix<T, rowsB, colsB>& B)
{
  // Since the algorithm works in place, b needs to be able to store
  // both input and output
  static_assert(rowsB >= rowsA && rowsB >= colsA, "b is too small");
  // LDA = rowsA, LDB = rowsB
  if constexpr (::std::is_same_v<T, float>) {
    // SGELS(mode == TransposeMode::None ? 'N' : 'T', ....);
  } else if constexpr (::std::is_same_v<T, double>) {
    // DGELS(mode == TransposeMode::None ? 'N' : 'T', ....);
  } else {
    static_assert(always_false_v<T>, "Unknown type");
  }
}

Now, addressing the subaddressing possible with LDA and LDB. I propose that you make that part of your data type, not directly part of the template signature. You want to have your own matrix type that can reference storage in a matrix. Perhaps something like this:

// Since we store elements in a column-major order, we can always 
// pretend that our matrix has less columns than it actually has
// less rows than allocated. We can not equally pretend less rows
// otherwise the addressing into the array is off.
// Thus, we'd only four total parameters:
// offset = columnSkipped * actualRows + rowSkipped), actualRows, rows, cols
// We store the offset implicitly by adjusting our begin pointer
template<typename T, size_t rows, size_t cols, size_t actualRows>
class matrix_view { // Name derived from string_view :)
  static_assert(actualRows >= rows);
  T* start;
  matrix_view(T* start) : start(start) {}
  template<typename U, size_t r, size_t c, size_t ac>
  friend class matrix_view;
public:
  template<typename U>
  matrix_view(matrix<U, rows, cols>& ref)
  : start(ref.data()) { }

  template<size_t rowSkipped, size_t colSkipped, size_t newRows, size_t newCols>
  auto submat() {
    static_assert(colSkipped + newCols <= cols, "can only shrink");
    static_assert(rowSkipped + newRows <= rows, "can only shrink");
    auto newStart = start + colSkipped * actualRows + rowSkipped;
    using newType = matrix_view<T, newRows, newCols, actualRows>
    return newType{ newStart };
  }
  T* data() {
    return start;
  }
};

Now, you'd need to adapt your interface to this new datatype, that's basically just introducing a few new parameters. The checks stay basically the same.

// Using this instead of just type-defing allows us to use deducation guides
// Replaces: using matrix = std::array from above
template<typename T, size_t rows, size_t cols>
class matrix {
public:
    std::array<T, rows * cols> storage;
    auto data() { return storage.data(); }
    auto data() const { return storage.data(); }
};

extern void dgels(char TRANS
  , integer M, integer N , integer NRHS
  , double* A, integer LDA
  , double* B, integer LDB); // Mock, missing a few parameters at the end
// Replaces the solve method from above
template < typename T, size_t rowsA, size_t colsA, size_t actualRowsA
    , size_t rowsB, size_t colsB, size_t actualRowsB
    , TransposeMode mode = TransposeMode::None >
void solve(matrix_view<T, rowsA, colsA, actualRowsA> A, matrix_view<T, rowsB, colsB, actualRowsB> B)
{
    static_assert(rowsB >= rowsA && rowsB >= colsA, "b is too small");
    char transMode = mode == TransposeMode::None ? 'N' : 'T';
    // LDA = rowsA, LDB = rowsB
    if constexpr (::std::is_same_v<T, float>) {
      fgels(transMode, rowsA, colsA, colsB, A.data(), actualRowsA, B.data(), actualRowsB);
    } else if constexpr (::std::is_same_v<T, double>) {
      dgels(transMode, rowsA, colsA, colsB, A.data(), actualRowsA, B.data(), actualRowsB);
    // DGELS(, ....);
    } else {
    static_assert(always_false_v<T>, "Unknown type");
    }
}

Example usage:

int main() {
  matrix<float, 5, 5> A;
  matrix<float, 4, 1> b;

  auto viewA = matrix_view{A}.submat<1, 1, 4, 4>();
  auto viewb = matrix_view{b};
  solve(viewA, viewb);
  // solve(viewA, viewb.submat<1, 0, 2, 1>()); // Error: b is too small
  // solve(matrix_view{A}, viewb.submat<0, 0, 5, 1>()); // Error: can only shrink (b is 4x1 and can not be viewed as 5x1)
}

I want to thank you. It is a wonderful answer, well explained and incredibly complete. Unfortunately @bolov gave an answer that is perfect for my case, and requires barely zero effort to be implemented. You have my upvote though. I'm sorry I can't reward you more.... — Matteo Ragni, Mar 08 '18 at 13:34
@MatteoRagni I appreciate it. I'm sure that bolov's answer will help more people in the long run, at least if they land on this question. My answer was more of an exercise myself to think up a way to adapt the interface in question — WorldSEnder, Mar 08 '18 at 13:39
@MatteoRagni I've added an example usage, and changed a bit of code if you're still interested. You might want to reconsider if you ever need a bit more of the functionality provided from behind the scenes — WorldSEnder, Mar 08 '18 at 14:15

Template function overloading and SFINAE implementations

Minimal complete and working example

3 Answers3