
There are numerous references on the subject (here or here). However, I still fail to understand why the following is not considered UB and properly reported by my favorite compiler (insert clang and/or gcc) with a neat warning:

// f1, f2 and epsilon are defined as double
if ( f1 / f2 <= epsilon )

As per C99:TC3, 5.2.4.2.2 §8, we have:

Except for assignment and cast (which remove all extra range and precision), the values of operations with floating operands and values subject to the usual arithmetic conversions and of floating constants are evaluated to a format whose range and precision may be greater than required by the type. [...]

With a typical compilation, f1 / f2 would be read directly from the FPU. I've tried this using gcc -m32, with gcc 5.2. So f1 / f2 is (over here) in an 80-bit (just a guess, I don't have the exact spec here) floating point register. There is no type promotion here (per the standard).

I've also tested clang 3.5; this compiler seems to cast the result of f1 / f2 back to a normal 64-bit floating point representation (this is implementation-defined behavior, but for my question I prefer the default gcc behavior).

As per my understanding, the comparison will be done between a type whose size we don't know (i.e. a format whose range and precision may be greater) and epsilon, whose size is exactly 64 bits.

What I really find hard to understand is an equality comparison between a well-known C type (e.g. a 64-bit double) and something whose range and precision may be greater. I would have assumed that somewhere the standard would require some kind of promotion (e.g. mandate that epsilon be promoted to a wider floating point type).

So the only legitimate syntaxes should instead be:

if ( (double)(f1 / f2) <= epsilon )

or

double res = f1 / f2;
if ( res <= epsilon )

As a side note, I would have expected the literature to document only the operator <; in my case:

if ( f1 / f2 < epsilon )

since it is always possible to compare floating point values of different sizes using operator <.

So in which cases would the first expression make sense? In other words, how could the standard define some kind of equality operator between two floating point representations of different sizes?


EDIT: The whole confusion here was that I assumed it was possible to compare two floats of different sizes, which cannot possibly happen. (Thanks @DevSolar!)

malat
    I don't understand your problem here. The part of the standard you quoted actually says that the promotion to `double` will automatically be done during evaluation, so the explicit cast is not needed. – NiBZ Sep 11 '15 at 11:28
  • 6
    Why, why, why would it be *undefined behavior*? I see nothing in your standard quote that indicates that. And nothing in the questions you linked to. Yes, it can sometimes produce counter intuitive results. But what does that have to do with UB? The results are perfectly defined, they just occasionally disagree with naive expectations. –  Sep 11 '15 at 11:29
  • @NiBZ you've read it backward. `f1 / f2` is *at least* the precision of double, not the other way around. – malat Sep 11 '15 at 11:47
  • Please re-phrase and post a new question, stepping back from a couple of your assumptions and instead of asking about conclusions you made from those assumptions, question the assumptions themselves. It should be clear (from the lengthy discussions here and the close votes) that people have problems figuring out what, exactly, you are asking. – DevSolar Sep 11 '15 at 12:37
  • @DevSolar I've not changed a bit of initial question. I've explained some intermediate steps which seems obvious to me, but apparently not by some admins here. My interrogations remains exactly the same. Thanks for your help. – malat Sep 11 '15 at 12:46
  • I am really not trying to offend, but to help you with understanding the matter that has you confused. `<=` is well-defined, which is answering your question's title and your first paragraph. The close votes are (most likely) coming from the part of the question that (ironically) starts with "clearly", because that is where you are losing people, i.e. "unclear what you're asking". That's why I was suggesting you post a new question that doesn't involve "clearly", but rather asks if the things that seem "clear" to you are actually fact. – DevSolar Sep 11 '15 at 12:58
  • Looking at your latest edit, what makes you assume that `epsilon` is *not* promoted to that potentially wider format before the comparison? – DevSolar Sep 11 '15 at 13:10
  • `If an operation involves two operands, and one of them is of type double, the other one is converted to double.`. There is no `long double` in my code. – malat Sep 11 '15 at 13:12
  • Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/89386/discussion-between-devsolar-and-malat). – DevSolar Sep 11 '15 at 13:15

4 Answers

4

<= is well-defined for all possible floating point values.

There is one exception, though: the case where at least one of the arguments is uninitialised. But that's more to do with reading an uninitialised variable being UB, not the <= itself.

Bathsheba
  • so you're saying that `epsilon` (64bits) gets copied over to the FPU with 80 bits precision -for example- before doing the equality ? – malat Sep 11 '15 at 11:39
  • 1
    The standard says "may be". That implies "may be not". But why would it matter, and where do you see UB? – DevSolar Sep 11 '15 at 11:40
  • Could you paste this section please ? – malat Sep 11 '15 at 11:41
  • @malat: I actually took it from the very section *you* posted. – DevSolar Sep 11 '15 at 11:42
  • I still fail to understand how `equality` operator can be a defined behavior, when comparison occurs in between a well known defined C type and something where `precision may be greater than` – malat Sep 11 '15 at 11:45
  • @malat: Comparing floats for equality is considered bad because numbers that *should* be *exactly* equivalent might not be, because of imprecisions involved in the binary representation. That doesn't make the operation undefined behaviour. And while `==` is usually a bad (semantic) idea exactly for this reason, `<=` (and by implication, `>=`) are quite safe, both syntactically and semantically. – DevSolar Sep 11 '15 at 11:47
  • 2
    @DevSolar: *often* considered bad. Comparing floats for equality is perfectly acceptable so long as you know what you're doing. Integers up to 2^53, for example, can be compared perfectly in 64 bit IEEE754. – Bathsheba Sep 11 '15 at 11:48
  • 1
    @Bathsheba: But the compiler warnings OP is referring to don't care for that, they consider *all* `==` applied to floating point numbers "bad". But -- and this is directed at the OP -- "generates a warning" is not equivalent to "is UB". – DevSolar Sep 11 '15 at 11:50
  • @DevSolar I still don't have an answer on how truncation is defined. Comparing two 64-bit doubles for equality is dumb but perfectly defined. Comparing an 80-bit float and a 64-bit float makes my head spin. – malat Sep 11 '15 at 11:51
  • 2
    @malat: Why would "evaluating to a format whose range and precision [maybe being] greater than required by the type" -- like, putting the 64-bit in-memory `double` values of `f1`, `f2`, and / or `epsilon` into 80-bit FPU registers -- result in *truncation*? You're not making sense here. – DevSolar Sep 11 '15 at 12:14
  • @malat what's the problem? The same way you can compare 8-bit integer and a 16-, 32- and 64-bit integer, you could compare floats of different size, by promoting smaller one to the size of larger one. – el.pescado - нет войне Sep 11 '15 at 12:18
  • promoting a type to a wider type doesn't change the value, why should it be UB? Are there any cases where it's unsafe to do so? – phuclv Sep 11 '15 at 12:19
  • @malat moreover, comparing doubles isn't dumb. `2.0 + 2.0 == 4.0`, as long you don't use Intel Pentium;) – el.pescado - нет войне Sep 11 '15 at 12:22
  • @malat: And before you ask, once the calculation is being done (and you're doing the assignment or cast mentioned by the standard), you're putting those 80-bit register values back into the 64-bit format in our example. Yes, there would be truncation going on at that point, but you're truncating range / precision that a 64-bit register wouldn't have provided in the first place, so you cannot be worse off than without the 80-bit. – DevSolar Sep 11 '15 at 12:23
  • @el.pescado 8, 16, 32 and 64 bits are all defined C types; `format whose range and precision may be greater` is not. When types are different there is automatic promotion, but not in my case (AFAIK). – malat Sep 11 '15 at 12:29
3

I think you're confusing implementation-defined with undefined behavior. The C language doesn't mandate IEEE 754, so all floating point operations are essentially implementation-defined. But this is different from undefined behavior.

nwellnhof
2

After a bit of chat, it became clear where the miscommunication came from.

The quoted part of the standard explicitly allows an implementation to use wider formats for floating operands in calculations. This includes, but is not limited to, using the long double format for double operands.

The standard section in question also does not call this "type promotion". It merely refers to a format being used.

So, f1 / f2 may be done in some arbitrary internal format, but without making the result any other type than double.

So when the result is compared (by either <= or the problematic ==) to epsilon, there is no promotion of epsilon (because the result of the division never got a different type), but by the same rule that allowed f1 / f2 to happen in some wider format, epsilon is allowed to be evaluated in that format as well. It is up to the implementation to do the right thing here.

The value of FLT_EVAL_METHOD might tell you exactly what an implementation is doing (if set to 0, 1, or 2 respectively), or it might have a negative value, which indicates "indeterminable" (-1) or "implementation-defined", which means "look it up in your compiler manual".

This gives an implementation "wiggle room" to do any kind of funny things with floating operands, as long as at least the range / precision of the actual type is preserved. (Some older FPUs had "wobbly" precisions, depending on the kind of floating operation performed. The quoted part of the standard caters for exactly that.)

In no case may any of this lead to undefined behaviour. Implementation-defined, yes. Undefined, no.

DevSolar
1

The only case where you would get undefined behavior is when a large floating point variable gets demoted to a smaller one that cannot represent the value. I don't quite see how that applies in this case.

The text you quoted is concerned with whether or not floats may be evaluated as doubles etc., as indicated by the text you unfortunately didn't include in the quote:

The use of evaluation formats is characterized by the implementation-defined value of FLT_EVAL_METHOD:

-1 indeterminable;

0 evaluate all operations and constants just to the range and precision of the type;

1 evaluate operations and constants of type float and double to the range and precision of the double type, evaluate long double operations and constants to the range and precision of the long double type;

2 evaluate all operations and constants to the range and precision of the long double type.

However, I don't believe this macro overrides the behavior of the usual arithmetic conversions. The usual arithmetic conversions guarantee that you can never compare two float variables of different sizes. So I don't see how you could run into undefined behavior here. The only possible issue would be performance.

In theory, if FLT_EVAL_METHOD == 2, your operands could indeed get evaluated as type long double. But note that if the compiler allows such implicit promotions to larger types, there will be a reason for it.

According to the text you cited, an explicit cast will counter this compiler behavior.

In that case, the code if ( (double)(f1 / f2) <= epsilon ) is nonsense. By the time you cast the result of f1 / f2 to double, the calculation is already done and has been carried out on long double. The comparison result <= epsilon will however be carried out on double, since you forced this with the cast.

To avoid long double entirely, you would have to write the code as:

if ( (double)((double)f1 / (double)f2) <= epsilon )

or to increase readability, preferably:

double div = (double)f1 / (double)f2;
if( (double)div <= (double)epsilon )

But again, code like this only makes sense if you know that there will be implicit promotions, which you wish to avoid to increase performance. In practice, I doubt you'll ever run into that situation, as the compiler is most likely far more capable than the programmer of making such decisions.

Lundin