Modulus (Float) vs Branch

Question

Given 2 expressions that do the same thing ([-3.14, 3.14] -> [0, 6.28]):

a > 0? a : a + 6.28

or

fmod(a + 6.28, 6.28)

Is there a general difference between the two in performance?

Edit: Suppose such an expression is called many times (such that performance is relevant) and that the input a is different each time. (To make the question more directly answerable).

There is a very simple answer: Measure. Write a benchmark and try it out (make sure the optimizer is turned on). This is the only way to know if one thing is actually faster than the other. — NathanOliver, Nov 16 '18 at 16:05
Sure, but I guess I'm looking for a more rule of thumb answer. If there isn't one, then fine. But I would like to know if there is a general benefit of one over the other. — KobeSystem, Nov 16 '18 at 16:08
godbolt.org can be helpful for this. See the assembly your two solutions produces with gcc : https://godbolt.org/z/vEx4F9 — François Andrieux, Nov 16 '18 at 16:11
Also, another option, [`std::clamp`](https://en.cppreference.com/w/cpp/algorithm/clamp) for C++. — NathanOliver, Nov 16 '18 at 16:12
@FrançoisAndrieux Wow, that's interesting. It doesn't appear to let you see the compilation of 'fmod' though. — KobeSystem, Nov 16 '18 at 16:21
@KobeSystem That's because it should be provided by your standard library implementation. I guess it's not an ideal tool for this specific scenario. But in general when comparing solutions it's handy to take a look at the assembly. — François Andrieux, Nov 16 '18 at 16:23
@NathanOliver clamp is not relevant here (these 2 expressions leave 'a' untouched if it is positive, and shift 'a' to the [3.14, 6.28] range if it is negative). Regardless, I'm looking for a more general answer. François Andrieux's mention of [godbolt.org](http://godbolt.org) is more along the lines of what I'm looking for. — KobeSystem, Nov 16 '18 at 16:28
Usually when working with angles, one would prefer to map from [0, 2π] to [−π, π] rather than the other way around, to benefit from the increased floating-point precision near zero. Mapping a small negative value to a larger value loses accuracy. — Eric Postpischil, Nov 16 '18 at 18:12

chux - Reinstate Monica · Accepted Answer · 2018-11-16T18:43:51.710

// Ternary
t = a > 0? a : a + 6.28
// vs fmod
m = fmod(a + 6.28, 6.28)

Is there a general difference between the two in performance?

Of course, profiling is best, as @NathanOliver commented, yet as a general guide, consider optimization potential.

A compiler will typically optimize for the entire range of a's type, not just [-3.14, 3.14]. t, being a simple calculation, is readily optimizable.

Further, depending on FLT_EVAL_METHOD, in C the calculation of m is forced into double and is certainly a function call. More restrictions mean fewer optimization possibilities. t may use an optimal FP width.

Recommend a > 0 ? a : a + 6.28 as a general preferred approach.

Given 2 expressions that do the same thing

But they do not do the same thing over the domain [-3.14, 3.14].

About 1/4 of all double values lie in the range [0, 1.0]. For such inputs, the a + 6.28 addition in m loses at least 3 bits of precision, and for small enough a it loses all of them. Advantage: t.

Ranges differ:
The range of t is (0, 6.28]: a = 0 fails the a > 0 test and maps to 6.28
The range of m is [0, 6.28), not [0, 6.28]

Consideration about the higher goal

It is apparent that the code is attempting trigonometric range reduction. Doing this well is harder than the basic sine, cosine, or tangent calculation itself. See ARGUMENT REDUCTION FOR HUGE ARGUMENTS: Good to the Last Bit.

If code is starting with degrees rather than radians, consider the advantages of range reduction in degrees first.

Bigger picture

Depending on how a is derived or how t and m are used, even better-performing approaches are possible. So if performance is truly a concern, the surrounding code is needed; otherwise we are micro-optimizing the wrong thing.
