Assembly / CPU arch: is -INT_MIN always equal to INT_MIN?

Question

Firstly, sorry for abusing C's terminology "INT_MIN" when I talk about assembly programming. But let me continue…

For example, on x86_64 Linux, a C function int neg(int x) { return -x; } would typically translate to movl %edi, %eax; negl %eax; ret (i386 loads the argument from the stack instead of %edi, but uses the same negl). Since x86 neg is defined as taking the two's complement of its operand (see the Intel 64 reference manual for details), neg applied to INT_MIN, whose bit representation is 0x80000000u, returns a number whose bit representation is 0x7fffffffu + 1 == 0x80000000u, that is, INT_MIN again, without raising any exceptions or traps.
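The bit-level claim can be checked in portable C by doing the negation on an unsigned value, where wraparound is well defined. This is only a minimal sketch; `neg_bits` is a hypothetical helper name, not something taken from the NTP sources.

```c
#include <stdint.h>

/* Two's-complement negation on the raw bit pattern: invert all bits, add 1.
 * Unsigned arithmetic wraps modulo 2^32, so no undefined behavior occurs. */
uint32_t neg_bits(uint32_t x)
{
    return (uint32_t)(~x + 1u);
}
```

Here neg_bits(0x80000000u) gives back 0x80000000u, and neg_bits(1u) gives 0xffffffffu.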

My question is, are there any (modern) CPU architectures whose most fundamental "negation" instruction doesn't do the same? ARM, SPARC, MIPS, Power? Embedded CPUs? Do they raise exceptions or have undefined behavior? (By the way, I guess some of them only have a "subtraction" instruction.)

I'm just curious how portable this piece of code in ntpdate(8) is:

https://github.com/ntp-project/ntp/compare/18762a8...c4c256e

(Roughly speaking, it conceptually does dostep = (NTPDATE_THRESHOLD <= abs(server->soffset)); by assuming abs(INT_MIN) == INT_MIN due to the above fact.)

While technically that could work, I believe the C standard specifies that as undefined behavior which allows the compiler to generate whatever code it wants, so that program is wrong (if it really does what you say, which is not obvious at a first glance). Note that for example gcc is getting more and more aggressive in (ab)using undefined behavior. — Jester, Jan 30 '16 at 23:15
The code is C, not assembly, and the behavior is undefined in C, so compilers are permitted to (and some do) optimize on the assumption that `server->soffset` is never `INT_MIN` — Raymond Chen, Jan 30 '16 at 23:19
I believe you would get a trap when using `sub $v0, $zero, $a0` with `$a0 == INT_MIN` on MIPS. — EOF, Jan 30 '16 at 23:36
@EOF Thanks, I was looking for such information. Yeah, the hack is fishy on MIPS (and Alpha, according to some internet search) but at least my Clang 3.7 prefers to use `subu` and `negu` which don't trap. — nodakai, Jan 31 '16 at 03:42
@nodakai: gcc has a `-ftrapv` option, which should make it use the trapping signed instructions. — ninjalj, Jan 31 '16 at 21:27

5gon12eder · Answer 1 · 2016-01-31T21:14:26.173

Let's look at the code in question (reduced to a minimal example).

bool
dostep(int32_t absoffset)
{
  if (absoffset < 0)
    absoffset = -absoffset;
  return (absoffset >= NTPDATE_THRESHOLD || absoffset < 0);
}

It is clear that the expression absoffset < 0 that is the second operand of the || can never be true without overflow. But integer overflow is undefined behavior in C. Therefore, a compiler is perfectly allowed to optimize the check away.

It doesn't matter how the machine would handle an integer overflow if it actually executed the instruction. If you program in C, you don't program the hardware; you program the abstract machine defined by the C standard. It is good to know how the real hardware works in order to reason about performance. It is dangerous, however, to make assumptions about code that invokes undefined behavior based on expectations of how the compiler would translate the broken code into machine code. The compiler is allowed to generate any code as long as it makes the real machine behave as the standard mandates for the abstract machine. And since the standard imposes no requirements on undefined behavior, no assumptions can be made about this case. To put it bluntly: the NTP code is broken.

One possible fix would be to re-arrange the check like this.

bool
dostep(int32_t absoffset)
{
  if (absoffset < 0)
    absoffset = (absoffset == INT32_MIN) ? INT32_MAX : -absoffset;
  return (absoffset >= NTPDATE_THRESHOLD);
}

In this particular case, however, there exists an even simpler solution.

bool
dostep(const int32_t absoffset)
{
  return ((absoffset <= -NTPDATE_THRESHOLD) || (absoffset >= NTPDATE_THRESHOLD));
}
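As a sanity check, the two rewrites can be placed side by side and compared at the boundary values. This is a sketch under assumptions: the `NTPDATE_THRESHOLD` value below is a placeholder (the real one is defined in the NTP sources), and the names `dostep_clamped` and `dostep_range` are made up for this comparison.

```c
#include <stdbool.h>
#include <stdint.h>

#define NTPDATE_THRESHOLD 1000  /* placeholder value, assumed positive */

/* First fix: clamp |INT32_MIN| to INT32_MAX so the negation never overflows. */
bool dostep_clamped(int32_t absoffset)
{
    if (absoffset < 0)
        absoffset = (absoffset == INT32_MIN) ? INT32_MAX : -absoffset;
    return absoffset >= NTPDATE_THRESHOLD;
}

/* Second fix: test both bounds directly; the input is never negated. */
bool dostep_range(int32_t offset)
{
    return offset <= -NTPDATE_THRESHOLD || offset >= NTPDATE_THRESHOLD;
}
```

Both functions return true for INT32_MIN without relying on overflow, and they agree on every other input as long as the threshold is positive.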

Another option would be to use inline assembly that is not subject to the rules of the C standard and is obviously not portable.

To be fair, an implementation may provide its own extensions to the C standard in order to make undefined behavior defined. GCC and Clang do this for integer overflow by providing the -fwrapv flag. From GCC's man page:

-fwrapv

This option instructs the compiler to assume that signed arithmetic overflow of addition, subtraction and multiplication wraps around using twos-complement representation. This flag enables some optimizations and disables others. This option is enabled by default for the Java front end, as required by the Java language specification.

If the NTP code were compiled with GCC and this flag, and the target hardware implemented signed integer overflow as wrap-around, then the code would be correct. But this is no longer portable standard C.
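If staying within portable standard C is the goal, another route is to take the absolute value in unsigned arithmetic, where wraparound is defined. This is not what the NTP code does, only a sketch of the idea, and `uabs32` is a hypothetical helper name.

```c
#include <stdint.h>

/* Absolute value returned as an unsigned number, so |INT32_MIN| is representable. */
uint32_t uabs32(int32_t x)
{
    /* Converting a negative value to uint32_t is defined (modulo 2^32),
     * and 0u - ux wraps around instead of overflowing. */
    uint32_t ux = (uint32_t)x;
    return (x < 0) ? (uint32_t)(0u - ux) : ux;
}
```

With this helper, uabs32(INT32_MIN) is 0x80000000u (2147483648), which can then be compared against an unsigned threshold without any special case.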

"bool dostep(int32_t absoffset) ..." Sorry, what is `bool`? Your "fix" wouldn't even _compile_ with a C99-conforming compiler, let alone portability :-) — nodakai, Jan 31 '16 at 03:00
Also, the NTP code tries to be portable to platforms without C99 `intNN_t` or even `long long`. Assuming we have GCC and such a hardware like you described is far too much. Actually `ntpdate` is merely a second-class citizen in the NTP package and is known to have so many problems that the NTP team even deprecated it. That's one of the reasons why the above questionable piece of code survived until today. — nodakai, Jan 31 '16 at 03:19
C99 does have `<stdbool.h>` and `<stdint.h>`, which provide `bool` and `int32_t`. — 5gon12eder, Jan 31 '16 at 04:44
Shame on me, you are totally correct. I thought C99 comes only with `_Bool`. But anyways, ... — nodakai, Jan 31 '16 at 07:36
