3

I am writing a code that has to compile in single and double precision. The original version was only in double precision, but I am trying to enable single precision now by using templates.

My question is: is it necessary to cast the 1. and the 7.8 to the specified type with static_cast<TF>(1.) for instance, or will the compiler take care of it? I find the casting not remarkably pretty and prefer to stay away from it. (I have other functions that are much longer and that contain many more literal constants).

template<typename TF>
inline TF phih_stable(const TF zeta)
{
    // Hogstrom, 1988
    return 1. + 7.8*zeta;
}
Chiel
  • 6,006
  • 2
  • 32
  • 57
  • 1
    If you do not cast, `zeta` will be converted to double precision, then the computation will be made in double precision, and the result will be cast to single precision. Said otherwise, you could use the double precision function with a single precision float and you'd have the same result. – Holt Mar 30 '18 at 09:18
  • @Holt. The single precision functions will be used in CUDA and for performance reasons I do not want to have any double precision operation in there. So I guess, that means I have to add the casts, right? – Chiel Mar 30 '18 at 09:19
  • Yes, you need the cast. Slightly less verbose would be to write `TF(7.8)` instead of `static_cast(7.8)`. – Holt Mar 30 '18 at 09:22
  • Why cast ?? Use `1.f`... – llllllllll Mar 30 '18 at 09:22
  • 1
    @liliscent He wants different types depending on `TF` – Barmar Mar 30 '18 at 09:23
  • @liliscent Depending on the literal, you could lose double precision if you use only single precision literals unless you separate the two overloads. – Holt Mar 30 '18 at 09:23
  • @Barmar For this particular example, if `TF` is double, there will be implicit conversion to `double`. – llllllllll Mar 30 '18 at 09:24
  • Ulrich answer is good, you may also read https://stackoverflow.com/questions/33163772/what-is-the-difference-between-casting-to-float-and-adding-f-as-a-suffix-whe – Jean-Baptiste Yunès Mar 30 '18 at 09:24
  • @Holt My solution exploits the implicit conversion. For some cases, it's ok. But generally error-prone. – llllllllll Mar 30 '18 at 09:26
  • OT: Consider using [`constexpr`](http://en.cppreference.com/w/cpp/language/constexpr) rather than [`inline`](http://en.cppreference.com/w/cpp/language/inline), for functions like this. – Bob__ Mar 30 '18 at 09:29
  • @Bob__. What does that change? – Chiel Mar 30 '18 at 09:30
  • From cppreference *"(since C++11) A function declared constexpr is implicitly an inline function."*, while *"The constexpr specifier declares that it is possible to evaluate the value of the function or variable at compile time.*". Not relevant to your question, I know, it's off topic. – Bob__ Mar 30 '18 at 09:34

2 Answers2

3

Casting and implicit conversions are two things. For this example, you could treat the template function as if it were two overloaded functions, but with the same code within. At the interface level (parameters, return value) the compiler will generate implicit conversions.

Now, the question you will have to ask yourself is this: Do those implicit conversions do what I want? If they do, just leave it as it is. If they don't, you could try to add explicit conversions (maybe use the function-style casting like TF(1.)) or, you could specialize this function for double and float.

Another option, less general but maybe it works here, is that you switch the code around, i.e. that you write the code for single-precision float and then let the compiler apply its implicit conversions. Since the conversions usually only go to the bigger type, it should fit for both double and float without incurring any overhead for float.

Ulrich Eckhardt
  • 16,572
  • 3
  • 28
  • 55
  • What I want, is that when `TF` is `float`, that all operations are done in single precision. I like your last idea. Does this go well in all conditions? – Chiel Mar 30 '18 at 09:25
  • I can't forsee "all conditions", but the approach should work. If you have tests in place, you can make sure it does, and if it doesn't you could still specialize the template. – Ulrich Eckhardt Mar 30 '18 at 09:27
  • So in case `TF` is `double` and `1.f + 7.8f*zeta` is done, what exactly does the compiler do and in which order are the casts done? Can you maybe add that to your answer? – Chiel Mar 30 '18 at 09:33
  • 1
    @Chiel Everything will be converted to `double` in the end, `7.8f` first, to compute `7.8 * zeta`, then `1.f` to compute the final result. The problem here is that `7.8f` is not `7.8`, you may lose precision here (I don't know if the standard is allowed to actually use the literal `7.8` instead of `7.8f` here when `zeta` is `double`). – Holt Mar 30 '18 at 09:43
  • @Holt I'd have to check, but I think it does not. – YSC Mar 30 '18 at 09:50
1

When you do:

return 1. + 7.8*zeta;

The literals 1. and 7.8 are double, so when if zeta is a float, it will first be converted to double, then the whole computation will be done in double-precision, and the result will be cast back to float, this is equivalent to:

return (float)(1. + 7.8 * (double)zeta);

Otherwise said, this is equivalent to calling phih_stable(double) and storing the result in a float, so your template would be useless for float.

If you want the computation to be made in single precision, you need the casts1:

return TF(1.) + TF(7.8) * zeta;

What about using 1.f and 7.8f? The problem is that (double)7.8f != 7.8 due to floating point precision. The difference is around 1e-7, the actual stored value for 7.8f (assuming 32-bits float) is:

7.80000019073486328125

While the actual stored value for 7.8 (assuming 64-bits double) is:

7.79999999999999982236431605997

So you have to ask yourself if you accept this loss of precision.


You can compare the two following implementations:

template <class T>
constexpr T phih_stable_cast(T t) {
    return T(1l) + T(7.8l) * t;
}

template <class T>
constexpr T phih_stable_float(T t) {
    return 1.f + 7.8f * t;
}

And the following assertions:

static_assert(phih_stable_cast(3.4f) == 1. + 7.8f * 3.4f, "");
static_assert(phih_stable_cast(3.4) == 1. + 7.8 * 3.4, "");
static_assert(phih_stable_cast(3.4l) == 1. + 7.8l * 3.4l, "");
static_assert(phih_stable_float(3.4f) == 1.f + 7.8f * 3.4f, "");
static_assert(phih_stable_float(3.4) == 1. + 7.8 * 3.4, "");
static_assert(phih_stable_float(3.4l) == 1.l + 7.8l * 3.4l, "");

The two last assertions fail due to the loss of precision when doing the computation.

1 You should even downcast from long double to not lose precision when using your function with long double: return TF(1.l) + TF(7.8l) * zeta;.

Holt
  • 36,600
  • 7
  • 92
  • 139