6

I have the following function:

char f1( int a, unsigned b ) { return abs(a) <= b; }

For execution speed, I want to rewrite it as follows:

char f2( int a, unsigned b ) { return (unsigned)(a+b) <= 2*b; } // redundant cast

Or, alternatively, with this signature, which could have subtle implications even for non-negative b:

char f3( int a, int b )      { return (unsigned)(a+b) <= 2*b; }

Both of these alternatives work under a simple test on one platform, but I need it to be portable. Assuming non-negative b and no risk of overflow, is this a valid optimization for typical hardware and C compilers? Is it also valid for C++?


Note: Compiled as C++ with gcc 4.8 for x86_64 at -O3, f1() uses 6 machine instructions and f2() uses 4. The instructions for f3() are identical to those for f2(). Also of interest: if b is given as a literal, both functions compile to 3 instructions that map directly to the operations specified in f2().
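
For reference, a brute-force check along the lines of my simple test (just a sketch: it only covers a small range well inside the stated assumptions, and passing on one platform says nothing about portability):

#include <stdio.h>
#include <stdlib.h>

static char f1( int a, unsigned b ) { return abs(a) <= b; }
static char f2( int a, unsigned b ) { return (unsigned)(a+b) <= 2*b; }

int main( void ) {
  for ( int a = -1000; a <= 1000; ++a )
    for ( unsigned b = 0; b <= 1000; ++b )
      if ( f1(a, b) != f2(a, b) )
        printf( "mismatch: a=%d b=%u\n", a, b );
  return 0;
}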

Brent Bradburn
  • What part of this do you suspect is undefined behavior and why? – NathanOliver Aug 09 '16 at 17:36
  • @NathanOliver: The cast to `unsigned` may reinterpret the bits. – Brent Bradburn Aug 09 '16 at 17:37
  • Why cast to unsigned if the result can be negative? – Eugene Sh. Aug 09 '16 at 17:38
  • 2
    Why do you think this implementation would be more optimized? Did you actually check what the compiler emits? – πάντα ῥεῖ Aug 09 '16 at 17:39
  • 4
    That cast is completely superfluous anyway. – T.C. Aug 09 '16 at 17:39
  • 1
    @nobar It does not reinterpret any bits, it yields the representative of the casted number's equivalence class that fits into the range of `unsigned`. (Well, it might do that by reinterpreting bits, but you need not care about that.) – Baum mit Augen Aug 09 '16 at 17:40
  • 1
    In the second case that cast may result in a huge number if the sum turns out to be negative, and this function will most likely return a `1` when it should actually be a zero. – ForceBru Aug 09 '16 at 17:41
  • 1
    @EugeneSh. @T.C.: Without the cast, it doesn't work correctly. Negative numbers are mapped into the high-end of positive values -- which are above the `b` threshold. – Brent Bradburn Aug 09 '16 at 17:41
  • @πάνταῥεῖ: The bottom part of my question addresses your comment directly. – Brent Bradburn Aug 09 '16 at 17:42
  • 1
    @nobar If you see a difference without the cast, I would like to see a MCVE for that. `a` gets promoted to `unsigned` before the addition is performed anyways, and the result is an `unsigned` again. (At least in C++, but afaik that holds in C too.) – Baum mit Augen Aug 09 '16 at 17:43
  • Did you try `return ((a<0) ? (a<=b) : (-a<=b));` ? – dbush Aug 09 '16 at 17:44
  • 1
    @dbush: Just tested your suggestion (with fix to `a>0`). This produces 5 instructions of which one is a branch (that I would like to avoid). – Brent Bradburn Aug 09 '16 at 17:47
  • 5
    `a*a <= b*b` - no branches..Overflow danger, though – Eugene Sh. Aug 09 '16 at 17:49
  • 2
    @BaummitAugen: I made a mistake here due to testing multiple scenarios. The cast *was* superfluous as this was originally posted because it is implicit based on the type of `b` being `unsigned`. I should have left `b` as type `int`, and will make that edit now. – Brent Bradburn Aug 09 '16 at 17:59
  • 3
    Did you check that the 6 instruction are actually slower than the 4 instructions? That's not always the case. – rjp Aug 09 '16 at 18:00
  • @rjp: Good point. But the instruction counts given here are only for the sake of interest. I want to give the formula that is generally simplest in order to help the compiler. Real testing on my deeply embedded platform will come later. – Brent Bradburn Aug 09 '16 at 18:05
  • The two functions don't return the same values. http://ideone.com/ahbzMk – kfsone Aug 09 '16 at 18:11
  • 2
    @kfsone: `b` is meant to always be non-negative. – Brent Bradburn Aug 09 '16 at 18:15
  • 1
    Then it should be `char f2(int a, unsigned int b) { return unsigned(a) + b <= 2*b;}`. – kfsone Aug 09 '16 at 18:17
  • 1
    Language lawyering is hard :). I will revert and expand the question. In my mind, the exact way that I specified the function isn't the interesting thing -- but rather it's the idea of using a wraparound in this way. So an answer would be fine if it noted a slightly different way to write the function as long as the wraparound concern was addressed. – Brent Bradburn Aug 09 '16 at 18:28
  • 2
    The answers are confusing because you changed the question after you got two answers. – yellowantphil Aug 09 '16 at 18:32
  • 1
    "is this a valid optimization for typical hardware and C compilers?" and `unsigned(a+b)` are confusing as that code does not compile in C. – chux - Reinstate Monica Aug 09 '16 at 18:38
  • @PSkocik Unclear, what is the usefulness of `int a; ... (int)((unsigned int)a)`? Is not the result typically `a`? – chux - Reinstate Monica Aug 09 '16 at 18:45
  • 1
    @rjp: I agree it's sloppy and a bad idea to just talk about instruction counts for micro-optimizations like this. But in this case, the instructions in question are all cheap (according to the compiler output: https://godbolt.org/g/v9XcyE), and if anything the version with more instructions uses a more expensive instruction. (`sar` can't run on as many execution ports on Intel SnB-family CPUs.) Two independent `add`s are cheaper than `mov` / `sar` / `xor` / `sub`. clang uses mov/neg/cmov, which is still more expensive (especially on pre-Broadwell). (See the x86 tag wiki) – Peter Cordes Aug 09 '16 at 18:53
  • 1
    Suggest `inline bool f2(int a, int b) { return a+b <= 2u * b; }` (`bool` and `2u`) – chux - Reinstate Monica Aug 09 '16 at 19:02
  • 1
    And BTW, the original `_Bool f2_unsigned( int a, unsigned b ) { return (unsigned)(a+b)<=2*b; }` compiles to identical x86-64 code as the updated `_Bool f2_signed( int a, int b ) { return (unsigned)(a+b)<=2*b; }`. https://godbolt.org/g/3MdK8o (also including chux's version). (I assume you'd only use this as an inline function, so using `_Bool` to indicate to the caller that the return value is guaranteed to be a zero or one, not just any non-zero value, doesn't gain anything.) `_Bool` is a C99 built-in type; `#include <stdbool.h>` for the typedef for `bool`. – Peter Cordes Aug 09 '16 at 19:03
  • 1
    @Peter Cordes Any particular reason/need for the `(unsigned)` cast? `_Bool f2_unsigned( int a, unsigned b ) { return a+b <= 2*b; }` looks sufficient. – chux - Reinstate Monica Aug 09 '16 at 19:05
  • 2
    @chux: no. Those are the OP's code, not my suggestion. Just pointing out that the controversial edit that changes the question from a language lawyer perspective doesn't change the asm for gcc targeting x86, for whatever that's worth. (And no, that doesn't mean it's necessarily safe or the same after inlining). Your `2*b` or `2u * b` code compiles to the same good code as the OP's more efficient possibly-unsafe version. – Peter Cordes Aug 09 '16 at 19:06
  • @chux: Yes, thanks. Using `2u` does resolve a "sign-compare" warning in my `f3()`. – Brent Bradburn Aug 09 '16 at 19:43
  • Oops, update: making `b` signed does change the asm for `f1`, but not for `f2`. This is because `abs(int)` still returns `int`, so `f1(int, int)` does a signed compare (`setle` instead of `setbe`), like `f3`. [All versions on the Godbolt compiler explorer](https://godbolt.org/g/bOZsa1). – Peter Cordes Aug 09 '16 at 22:27
  • Have you consider a macro? `#define F4(a,b) ((unsigned)((a)+(b)) <= 2u*(b))`? – chux - Reinstate Monica Aug 10 '16 at 14:47
  • @chux: I'm assuming that the functions would be inlined by the compiler -- but if not, I would probably use a macro like the one you suggest. – Brent Bradburn Aug 11 '16 at 03:47

4 Answers

3

Starting with the original code with signature

char f2( int a, unsigned b );

this contains the expression

a + b

Since one of these operands has a signed integer type and the other the corresponding unsigned integer type (so they have the same "integer conversion rank"), the "usual arithmetic conversions" (§ 6.3.1.8) dictate that the operand with signed integer type is converted to the unsigned type of the other operand.

Conversion to an unsigned integer type is well defined, even if the value in question cannot be represented by the new type:

[..] if the new type is unsigned, the value is converted by repeatedly adding or subtracting one more than the maximum value that can be represented in the new type until the value is in the range of the new type. 60

§ 6.3.1.3/2

Footnote 60 just says that the described arithmetic works with the mathematical value, not the typed one.
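
As a small illustration of that rule (a sketch; the check holds on any conforming implementation, whatever the width of unsigned):

#include <stdio.h>
#include <limits.h>

int main(void) {
  int a = -5;
  unsigned u = (unsigned)a;          /* -5 + (UINT_MAX + 1), i.e. UINT_MAX - 4 */
  printf("%d\n", u == UINT_MAX - 4); /* prints 1 */
  return 0;
}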

Now, with the updated code

char f2_updated( int a, int b ); // called f3 in the question

things would look different. But since b is assumed to be non-negative, and assuming that INT_MAX <= UINT_MAX, you can convert b to unsigned without fear of its mathematical value changing. Thus you could write

char f2_updated( int a, int b ) {
  return f2(a, (unsigned)b); // cast unnecessary but to make it clear
}

Looking again at f2, the expression 2*b further limits the allowed range of b to at most UINT_MAX/2 (otherwise the mathematical result would be wrong). So as long as you stay within these bounds, everything is fine.

Note: Unsigned types do not overflow, they "wrap" according to modular arithmetic.
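
A quick illustration of that wrap at the 2*b limit (a sketch; UINT_MAX/2 + 1 is just outside the allowed range for b):

#include <stdio.h>
#include <limits.h>

int main(void) {
  unsigned b = UINT_MAX / 2 + 1; /* one past the allowed bound */
  printf("%u\n", 2 * b);         /* wraps around to 0 instead of the mathematical 2*b */
  return 0;
}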

Quotes from N1570 (a C11 working draft)


A final remark:

IMO the only really reasonable way to write this function is:

#include <stdbool.h>
#include <assert.h>
#include <limits.h> /* for UINT_MAX */
bool abs_bounded(int value, unsigned bound) {
  assert(bound <= (UINT_MAX / 2));
  /* NOTE: Casting to unsigned makes the implicit conversion that
           otherwise would happen explicit. */
  return ((unsigned)value + bound) <= (2 * bound);
}

Using a signed type for the bound does not make much sense, because the absolute value of a number cannot be less than a negative number: abs_bounded(value, something_negative) would always be false. If there's the possibility of a negative bound, then I'd catch this outside of this function (otherwise it does "too much"), like:

int some_bound;
// ...
if ((some_bound >= 0) && abs_bounded(my_value, some_bound)) {
  // yeeeha
}
Daniel Jour
  • 2
    Arr ... that edit changing the type of b to a signed integer came while I was writing the answer. – Daniel Jour Aug 09 '16 at 18:03
  • 1
    I really want to roll it back as it makes this answer and all of the comments pointless. – NathanOliver Aug 09 '16 at 18:08
  • @NathanOliver: The comments noted that the cast was superfluous, but that didn't affect the point of the question. – Brent Bradburn Aug 09 '16 at 18:11
  • 1
    @nobar: The cast was superfluous because it applied *after* promotion happened while evaluating `a+b`. Changing `b`'s type *does* have an effect. Read carefully what people are saying about how promotion happens when a binary operator has two operands of different types. – Peter Cordes Aug 09 '16 at 18:14
  • @nobar The edit changes the behavior of the entire code block as unsigned integer overflow is well defined. Now that you do not have that you have a different question. – NathanOliver Aug 09 '16 at 18:19
  • 5
    @EOF I tend to disagree: " — The rank of any unsigned integer type shall equal the rank of the corresponding signed integer type, if any." Equal rank means conversion to unsigned. – Daniel Jour Aug 09 '16 at 18:25
  • @EOF No problem, I had to look it up myself ^^ I reworded it to be hopefully less hand waving but still concise. – Daniel Jour Aug 10 '16 at 08:11
  • "Using a signed type .. does not make much sense" -- I think this question demonstrated that using `unsigned` types is generally a bad idea. Look at the confusion it caused. I actually only did it for documentation purposes -- because it was more concise than adding `assert(bound>=0)`, but it turned out to be a mistake. In my opinion, [the "usual arithmetic conversions" are not sensible](http://stackoverflow.com/q/38861423/86967) if you assume that arithmetic is more likely to involve negative values than to overflow signed values. – Brent Bradburn Aug 10 '16 at 14:48
  • @nobar This question demonstrated that editing the question (or not posting the actual code one is working on in the first place) may indeed cause lots of confusion ;) But I agree that the conversions are confusing. Banning `unsigned` is a far too drastic move IMO, though. – Daniel Jour Aug 10 '16 at 15:26
2

To determine if the 2 expressions are equivalent for your purpose, you must study the domain of definition:

  • abs(a) <= b is defined for all values of int a and unsigned b, with just one special case: a = INT_MIN. On 2's complement architectures, abs(INT_MIN) is not defined, but it most likely evaluates to INT_MIN, which, converted to unsigned as required for the <= comparison with an unsigned value, yields the correct result.

  • (unsigned)(a+b) <= 2*b may produce a different result for b > UINT_MAX/2. For example, it will evaluate to false for a = 1 and b = UINT_MAX/2+1 (spelled out in the sketch below). There might be more cases where your alternative formula gives an incorrect result.
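
A sketch of that counterexample (the two formulas disagree because 2*b wraps to 0):

#include <stdio.h>
#include <stdlib.h>
#include <limits.h>

int main(void) {
  int a = 1;
  unsigned b = UINT_MAX / 2 + 1;
  printf("%d\n", (unsigned)abs(a) <= b);      /* f1: prints 1 */
  printf("%d\n", (unsigned)(a + b) <= 2 * b); /* f2: prints 0, since 2*b wraps to 0 */
  return 0;
}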

EDIT: OK, the question was edited... and b is now an int.

Note that a+b invokes undefined behavior in case of overflow, and the same goes for 2*b. So you make the assumption that neither a+b nor 2*b overflows. Furthermore, if b is negative, your little trick does not work.

If a is in the range -INT_MAX/2..INT_MAX/2 and b in the range 0..INT_MAX/2, it seems to function as expected. The behavior is identical in C and C++.

Whether it is an optimization depends completely on the compiler, command line options, hardware capabilities, surrounding code, inlining, etc. You already address this part and tell us that you shave one or two instructions... Just remember that this kind of micro-optimization is not absolute. Even counting instructions does not necessarily help find the best performance. Did you perform some benchmarks to measure if this optimization is worthwhile? Is the difference even measurable?

Micro-optimizing such a piece of code is self-defeating: it makes the code less readable and potentially incorrect. b might not be negative in the current version, but if the next maintainer changes that, he/she might not see the potential implications.

chqrlie
  • This answer seems to address only the boundary issues. I attempted to explicitly exclude these by saying "no risk of overflow". – Brent Bradburn Aug 09 '16 at 20:39
  • @nobar: you only did so after I pointed them out. Modifying the question in a way that makes the answers irrelevant should only be done with EDIT: paragraphs. I answered the **valid** part of the question and hinted as to when it is **invalid**. See my updated answer for the **optimization** part of the question. – chqrlie Aug 10 '16 at 10:03
2

As OP wants fast and portable code (and b is positive), it first makes sense to code safely:

#include <stdbool.h>

// return abs(a) <= b;
inline bool f1_safe(int a, unsigned b ) { 
  return (a >= 0 && a <= b) || (a < 0 && 0u - a <= b);
}

This works for all a,b (assuming UINT_MAX > INT_MAX). Next, compare alternatives using an optimized compile (let the compiler do what it does best).
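
For instance, the awkward INT_MIN case falls out correctly without ever calling abs() (a sketch; the expected results assume the usual layout where UINT_MAX == 2u * INT_MAX + 1):

#include <stdio.h>
#include <stdbool.h>
#include <limits.h>

static bool f1_safe(int a, unsigned b) {
  return (a >= 0 && a <= b) || (a < 0 && 0u - a <= b);
}

int main(void) {
  /* 0u - a wraps to the mathematically correct |INT_MIN| as an unsigned,
     so abs(INT_MIN), which would be undefined, is never evaluated. */
  printf("%d\n", f1_safe(INT_MIN, (unsigned)INT_MAX + 1u)); /* prints 1 */
  printf("%d\n", f1_safe(INT_MIN, (unsigned)INT_MAX));      /* prints 0 */
  return 0;
}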


The following slight variation on OP's code will work in C/C++ but risks portability issues unless "Assuming non-negative b and no risk of overflow" can be certain on all target machines.

bool f2(int a, unsigned b) { return a+b <= b*2; }

In the end, OP's goal of fast and portable code may yield code that works optimally on the selected platform but not on others; such is micro-optimization.

chux - Reinstate Monica
  • To make no `UINT_MAX > INT_MAX` assumption, use `return (a >= 0 && a <= b) || (b && a < 0 && -1 - a <= b-1) || (b==0 && a== 0);` If `b` is a constant, much code will optimize out. – chux - Reinstate Monica Aug 09 '16 at 20:25
1

Yes, this is portable to compliant platforms. The conversion from signed to unsigned is well defined:

The description in the C spec is a bit contrived:

if the new type is unsigned, the value is converted by repeatedly adding or subtracting one more than the maximum value that can be represented in the new type until the value is in the range of the new type.

The C++ spec addresses the same conversion in a more sensible way:

In a two's complement representation, this conversion is conceptual and there is no change in the bit pattern


In the question, f2() and f3() achieve the same results in a slightly different way.

  • In f2() the presence of the unsigned operand causes a conversion of the signed operand, as required for C++ by the usual arithmetic conversions. The unsigned addition may or may not then result in a wrap-around past zero, which is also well defined [citation needed].
  • In f3() the addition occurs in signed representation with no trickiness, and then the result is (explicitly) converted to unsigned. So this is slightly simpler than f2() (and also more clear).

In both cases, you end up with the same unsigned representation of the sum, which can then be compared (as unsigned) to 2*b. The trick of treating a signed value as an unsigned type allows you to check a two-sided range with only a single comparison. Note also that this is a bit more flexible than using the abs() function, since the trick doesn't require that the range be centered around zero.
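
As an illustration of that last point, the same single-comparison trick extends to an arbitrary range lo..hi (a sketch; in_range is just an illustrative name, and all arithmetic is done as unsigned so there is no overflow concern):

#include <stdbool.h>

/* True iff lo <= x <= hi, for any ints with lo <= hi; one comparison, no branch. */
static bool in_range(int x, int lo, int hi) {
  return (unsigned)x - (unsigned)lo <= (unsigned)hi - (unsigned)lo;
}

For a non-negative int bound, the original check abs(value) <= bound is then just in_range(value, -bound, bound).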


Commentary on the "usual arithmetic conversions"

I think this question demonstrated that using unsigned types is generally a bad idea. Look at the confusion it caused here.

It can be tempting to use unsigned for documentation purposes (or to take advantage of the shifted value range), but due to the conversion rules, this may tend to be a mistake. In my opinion, the "usual arithmetic conversions" are not sensible if you assume that arithmetic is more likely to involve negative values than to overflow signed values.

I asked this followup question to clarify the point: mixed-sign integer math depends on variable size. One new thing that I have learned is that mixed-sign operations are not generally portable because the conversion type will depend on the size relative to that of int.
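
A small sketch of that size dependence (comments assume the common case of 16-bit short and 32-bit int):

#include <stdio.h>

int main(void) {
  int i = -2;
  unsigned short us = 1;
  unsigned int   ui = 1;
  /* us promotes to (signed) int, so the sum really is -1... */
  printf("%d\n", us + i < 0);  /* prints 1 when int is wider than short */
  /* ...but i converts to unsigned (a huge value), so this sum is never negative. */
  printf("%d\n", ui + i < 0u); /* prints 0 */
  return 0;
}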

In summary: Using type declarations or casts to perform unsigned operations is a low-level coding style that should be approached with the requisite caution.

Brent Bradburn
  • You could be more specific and say "two's complement platforms", not just "compliant platforms", if that's the requirement. That latter is like saying "it works on platforms where it works". What happens when you cast a negative `int` to unsigned on a one's complement or sign/magnitude machine? And is this code still safe on such a machine? It's still useful as long as it's free of possible signed overflows or other UB, but it would be interesting to know if it's safe in general, or exactly where it would go wrong on non-two's complement C implementations. – Peter Cordes Aug 10 '16 at 01:26
  • @PeterCordes: By "compliant", I meant "language-standard compliant". I think the standard pretty-much implies two's complement, without out-right requiring it. As I understand it, weird hardware would have to use workarounds to achieve language compliance. Workarounds like "repeatedly adding or subtracting one more than the maximum value that can be represented...". – Brent Bradburn Aug 10 '16 at 01:33
  • Yeah, I haven't thought through the implications of that wording either, but I'm not sure it's very expensive. It probably takes at most one branch to implement that wording, and casts from unsigned to int don't happen all the time. It's disappointing the standard refuses to actually standardize two's complement for anything other than atomic `int`, or stuff like arithmetic right shift of signed `int`. The standard is too weak for its own good; I like what I've seen of Rust so far (esp. built-in popcnt and stuff), but haven't really done anything with it. – Peter Cordes Aug 10 '16 at 01:38
  • The version which adds two `int` values without casting or coercing either to an unsigned type first will invoke UB for some combinations of operands. Even though the C89 rationale implies an intention that, when code performs arithmetic on two int values and immediately converts the result to an unsigned type which is not larger than "int", silent-wraparound two's-complement implementations should behave as though the operation was performed on unsigned values (since there's no defined case where it wouldn't), some compilers use integer overflow even in such cases as an excuse for wackiness. – supercat Aug 10 '16 at 21:12
  • 1
    One of the problems faced by the authors of C89 was the fact that some implementations promoted short unsigned types to int, while others promoted to unsigned. It's too bad C89 didn't resolve this issue by defining qualifiers to distinguish e.g. 16-bit types that represent numbers 0-65535 from 16-bit types that represent mod-65536 equivalence classes. Unqualified unsigned types could then behave in Implementation-defined fashion, while code wanting numbers or modular arithmetic could specify what it wanted. – supercat Aug 10 '16 at 21:24
  • @supercat: I'm assuming you're talking about signed integer overflow when you mention UB. I'm not concerned about that here. I figure UB on signed integer overflow is a reasonable outcome. It's up to the programmer to make sure the type is large enough for the math. – Brent Bradburn Aug 11 '16 at 03:42
  • 1
    @nobar: When the C Standard was written, and for a long time after, the majority of (e.g. 16-bit) compilers would have processed f3(16384,24576) by performing an addition that yields either 40960 or -24576 and a multiplication that yields either 49152 or -16384. It would then convert both results to unsigned (yielding 40960 and 49152, respectively), and performed the comparison. Recently, however, it has become fashionable for compilers to treat integer overflow--even in formerly-benign situations like the above--in ways that can cause arbitrary side-effects on other code. – supercat Aug 11 '16 at 14:57
  • Related: [Why does C++ standard specify signed integer be cast to unsigned in binary operations with mixed signedness?](http://stackoverflow.com/q/43336218/86967) – Brent Bradburn Apr 11 '17 at 03:44
  • Whether or not these, and related, operations are "undefined behavior" is, um, complicated: [signed overflow is UB](https://stackoverflow.com/a/17205195/86967), [signed overflow is UB](https://stackoverflow.com/q/16188263/86967). When talking about UB, it is necessary to be precise about the details of the operation. – Brent Bradburn Feb 29 '20 at 18:05
  • For shift-left it is even more confusing (or version dependent): https://stackoverflow.com/q/3784996 – Brent Bradburn Feb 29 '20 at 18:20
  • Is this undefined behavior? [unsigned long cannot hold the correct number over 2,147,483,647](https://stackoverflow.com/q/60455240/86967) – Brent Bradburn Feb 29 '20 at 18:44