Difference of unsigned integers: standard-supported way to get a signed result?

Question

Assume two arbitrary timestamps:

uint32_t timestamp1;    
uint32_t timestamp2;

Is there a standard-conforming way to get the signed difference of the two, besides the obvious variants of converting to a bigger signed type and the rather verbose if-else?

It is not known beforehand which one is larger, but it is known that the difference needs at most 20 bits, so it will fit into a signed 32-bit integer.

int32_t difference = (int32_t)( (int64_t)timestamp1 - (int64_t)timestamp2 );

This variant has the disadvantage that 64-bit arithmetic may not be supported by the hardware, and it is only possible if a larger type exists (what if the timestamp is already 64-bit?).

The other version

int32_t difference;
if (timestamp1 > timestamp2) {
  difference =    (int32_t)(timestamp1 - timestamp2);
} else {
  difference = - ((int32_t)(timestamp2 - timestamp1));
}

is quite verbose and involves conditional jumps.

What about the plain cast:

int32_t difference = (int32_t)(timestamp1 - timestamp2);

Is this guaranteed to work from the standard's perspective?

Given your known limit, `(int32_t) (timestamp1 + 1048576 - timestamp2) - 1048576` is guaranteed to compute the difference without overflow, and Apple LLVM 10.0.1 with Clang 1001.0.46.4 compiles it to a single `subl` instruction for x86_64. — Eric Postpischil, Jul 02 '19 at 14:19
Related: https://stackoverflow.com/questions/31967370/is-detecting-unsigned-wraparound-via-cast-to-signed-undefined-behavior — dbush, Jul 02 '19 at 14:22
More general question, with timestamps that are allowed to wrap around 0: [https://stackoverflow.com/questions/58720505/] — personal_cloud, Nov 06 '19 at 00:36
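
The biasing trick from Eric Postpischil's comment above can be written out as a small function. This is only a sketch under the question's stated assumption that the two timestamps differ by less than 2^20; the names `MAX_DIFF` and `biased_diff` are illustrative and do not come from the thread.

```c
#include <stdint.h>

/* Assumption taken from the question: |t1 - t2| < 2^20. */
#define MAX_DIFF UINT32_C(1048576) /* 2^20 */

/* Hypothetical helper: signed difference t1 - t2 without any
 * out-of-range unsigned-to-signed conversion. */
static int32_t biased_diff(uint32_t t1, uint32_t t2)
{
    /* The result is reduced modulo 2^32 when stored in uint32_t.
     * Under the assumption above it lies strictly between 0 and 2^21. */
    uint32_t biased = t1 + MAX_DIFF - t2;

    /* biased fits in int32_t, so this conversion preserves the value
     * (C11 6.3.1.3 paragraph 1). Removing the bias gives the difference. */
    return (int32_t)biased - (int32_t)MAX_DIFF;
}
```

For example, `biased_diff(40, 100)` evaluates to `-60`. If the assumption on the difference does not hold, the conversion is no longer guaranteed to be value-preserving.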

Bathsheba · Accepted Answer · 2019-07-02T15:08:42.377

8

You can use a union type pun based on

typedef union
{
    int32_t _signed;
    uint32_t _unsigned;
} u;

Perform the calculation in unsigned arithmetic, assign the result to the _unsigned member, then read the _signed member of the union as the result:

u result = {._unsigned = timestamp1 - timestamp2};
result._signed; // yields the result

This is portable to any platform that implements the fixed-width types we are relying on (implementations are not required to provide them). 2's complement is guaranteed for the signed member and, at the "machine" level, 2's complement signed arithmetic is indistinguishable from unsigned arithmetic. There's no conversion or memcpy-type overhead here: a good compiler will compile out what's essentially standardese syntactic sugar.

(Note that this is undefined behaviour in C++.)
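
As a minimal sketch, the union approach can be wrapped in a function that compiles as C99 or later. The names `pun32`, `as_signed`, `as_unsigned` and `timestamp_diff` are illustrative choices, not taken from the answer.

```c
#include <stdint.h>

/* Both members have the same width and no padding bits, so they
 * occupy exactly the same bytes. */
typedef union {
    int32_t  as_signed;
    uint32_t as_unsigned;
} pun32;

/* Hypothetical helper: signed difference t1 - t2 via a union pun. */
static int32_t timestamp_diff(uint32_t t1, uint32_t t2)
{
    /* The subtraction result is reduced modulo 2^32 on conversion
     * to uint32_t, so it can never overflow. */
    pun32 result = { .as_unsigned = (uint32_t)(t1 - t2) };

    /* Reading the other member reinterprets the same bits as a
     * two's complement int32_t. */
    return result.as_signed;
}
```

With this, `timestamp_diff(3, 10)` yields `-7` and `timestamp_diff(10, 3)` yields `7`.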

edited Jul 02 '19 at 15:08

answered Jul 02 '19 at 14:13


Use memcpy. As you say, a good compiler will optimize it away and simply recognize the type pun. Works in both C and C++. – Ben Voigt Jul 02 '19 at 14:26
@Bathsheba: I do not think there is an elegant solution in strictly conforming C. Reinterpreting via union or `memcpy` is likely about the best one can do. – Eric Postpischil Jul 02 '19 at 14:33
@EricPostpischil: In this case, I believe that type punning via pointer is also legal (signed and unsigned variations on the same type are pointer-interconvertible) – Ben Voigt Jul 02 '19 at 14:34
@BenVoigt: It is not the pointer interconvertibility that is important so much as that the aliasing rules in C 2018 6.5 7 explicitly allow aliasing corresponding signed and unsigned types. (By itself, the fact that converting pointers as in `unsigned int y; … int *x = (int *) &y;` is legal does not mean `*x` would be legal.) – Eric Postpischil Jul 02 '19 at 14:37
The `timestamp1` and `timestamp2` variables could be left as `uint32_t` in your code. Only the `result` variable needs to be a union. – Ian Abbott Jul 02 '19 at 15:02
@IanAbbott: Thanks, yes that's rather obvious and reads much better. – Bathsheba Jul 02 '19 at 15:03
Here's a macro definition along the same lines: `#define UTOS32(a) ((union { uint32_t u; int32_t i; }){ .u = (a) }.i)`. `UTOS32(timestamp1 - timestamp2)` yields the result. – Ian Abbott Jul 02 '19 at 15:37
All these solutions rely on the fact that the binary representation of the wrapped-around uint subtraction is the same as the signed subtraction result would be. So is this not essentially the same as casting the difference of the uint subtraction (my last piece of code)? – vlad_tepesch Jul 02 '19 at 18:23
@vlad_tepesch: Yes it is. My solution is centred around writing standard portable C. The compiler will exploit your arguments in its optimisations. This answer has survived inspection by Eric Postpischil: if they have not voiced reservations that's good enough for me! – Bathsheba Jul 02 '19 at 18:31
@Bathsheba but since the standard does not dictate two's complement, I do not understand why this way should be more standard and portable than the cast. Is reading a union member other than the last one written not UB (except through an unsigned char array)? – vlad_tepesch Jul 02 '19 at 20:35
Because the fixed width types, if present, must behave in this manner. I don’t see how I can make that clearer in the answer? – Bathsheba Jul 03 '19 at 06:00
@Bathsheba: Quality implementations will allow code to access unions using member lvalues, but the way N1570 6.5p7 is written doesn't actually require that they do so for non-character types. The authors of the Standard probably thought it obvious that quality compilers should make a reasonable effort to recognize cases where an lvalue of one type is used to derive an lvalue of another, and accommodate them in cases where their customers would require that, but the maintainers of clang and gcc are hostile to such notions. – supercat Jul 08 '19 at 22:27
Can someone explain why OP's suggestion: `int32_t difference = (int32_t)(timestamp1 - timestamp2);` won't do the job? `unsigned` overflow is well defined, then casting it to a signed type will ensure conversion to the proper negative binary representation of the machine. Is there something I'm missing? – izac89 Jul 10 '19 at 17:47
@user2162550 That’s undefined behaviour due to signed type overflow, even for the 2’s complement fixed width type. – Bathsheba Jul 10 '19 at 18:06
@Bathsheba where is the signed type overflow? I see a potential unsigned overflow, then cast. Do you mean the cast can result in an overflow? – izac89 Jul 10 '19 at 18:10
@user2162550 Exactly! – Bathsheba Jul 10 '19 at 18:11
How can casting uint32_t to int32_t cause overflow? It is news to me – izac89 Jul 10 '19 at 18:14
@user2162550 Well the unsigned type can store larger values than the signed one. Sorry to be the bearer of bad news. This is why this question is not trivial. – Bathsheba Jul 10 '19 at 18:15
But is it not well defined that, for example, casting `0xffffffff` to int32_t will result in `-1` ? – izac89 Jul 10 '19 at 18:17
@user2162550 absolutely not. Note this question is asked on the lawyer tag so you can’t say ‘well it will in practice’ on all current desktop platforms. – Bathsheba Jul 10 '19 at 18:18
I’m on the choo choo so can’t. Just google signed integral type overflow undefined behaviour. – Bathsheba Jul 10 '19 at 18:20
For anyone who struggled with this answer, like me, this is helpful: [7.20.1.1 Exact-width integer types] "The typedef name intN_t designates a signed integer type with width N , no padding bits, and a two's complement representation" so this explains the promised 2's complement representation. And regarding the OP's answer which contains UB, it is explained in section [6.3.1.3 Signed and unsigned integers] and in @dbush answer. – izac89 Jul 10 '19 at 19:07
@user2162550: Feel free to answer on those terms - include any bits of my answer at your leisure. I'd be sure to upvote - ping me. – Bathsheba Jul 11 '19 at 07:39
`(int32_t)(timestamp1 - timestamp2)` is implementation-defined , not undefined . There is no overflow since the subtraction occurs in an unsigned type which cannot overflow by definition. – M.M Nov 06 '19 at 00:13
Can you provide a reference for "2's complement is guaranteed for the signed member"? I thought the C standard allowed 1's complement representation for `int`. – personal_cloud Nov 06 '19 at 00:45
@personal_cloud: It does indeed. My answer is centered on the fact that a 2's complement arithmetic has the same operation at the bit level as unsigned arithmetic. – Bathsheba Nov 06 '19 at 14:36

score 1 · Answer 2 · answered Nov 06 '19 at 00:19

Bathsheba's answer is correct but for completeness here are two more ways (which happen to work in C++ as well):

uint32_t u_diff = timestamp1 - timestamp2;
int32_t difference;
memcpy(&difference, &u_diff, sizeof difference);

and

uint32_t u_diff = timestamp1 - timestamp2;
int32_t difference = *(int32_t *)&u_diff;

The latter is not a strict aliasing violation because that rule explicitly allows punning between signed and unsigned versions of an integer type.

The suggestion:

int32_t difference = (int32_t)(timestamp1 - timestamp2);

will work on any actual machine that exists and offers the int32_t type, but technically is not guaranteed by the standard (the result is implementation-defined).
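
For illustration, the `memcpy` and pointer-cast variants above can be placed side by side as functions. This is a sketch with hypothetical names (`diff_memcpy`, `diff_alias`); both are expected to return the same value for any pair of inputs.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: copy the bytes of the unsigned difference
 * into a signed object of the same size. */
static int32_t diff_memcpy(uint32_t t1, uint32_t t2)
{
    uint32_t u_diff = t1 - t2;   /* reduced modulo 2^32 */
    int32_t difference;
    memcpy(&difference, &u_diff, sizeof difference);
    return difference;
}

/* Hypothetical helper: read the unsigned object through an lvalue
 * of the corresponding signed type, which the aliasing rules permit. */
static int32_t diff_alias(uint32_t t1, uint32_t t2)
{
    uint32_t u_diff = t1 - t2;   /* reduced modulo 2^32 */
    return *(int32_t *)&u_diff;
}
```

Both `diff_memcpy(3, 10)` and `diff_alias(3, 10)` give `-7`.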

answered Nov 06 '19 at 00:19

M.M


As OP has commented on Bathsheba's answer, there is no documentation to support that `int32_t` has to be two's complement, nor that a type pun is any more portable than a regular typecast. So your comment "technically is not guaranteed by the standard" actually applies to all 3 solutions in this answer. – personal_cloud Nov 06 '19 at 02:04
@personal_cloud Huh? It's in the C Standard. See C11 7.20.1.1/1 – M.M Nov 06 '19 at 04:21
Oh, I see. Looks like this goes back to at least C99 too. So the *representation* of `int32_t` is specified, but the cast from `uint32_t` is not. So yes, I would accept that in [my more general question](https://stackoverflow.com/questions/58720505/) as well. – personal_cloud Nov 06 '19 at 23:51

score 0 · Answer 3 · answered Jul 02 '19 at 15:25

Converting an unsigned integer value to a signed integer type is implementation-defined when the value cannot be represented in the signed type. This is spelled out in section 6.3.1.3 of the C standard regarding integer conversions:

1 When a value with integer type is converted to another integer type other than _Bool, if the value can be represented by the new type, it is unchanged.

2 Otherwise, if the new type is unsigned, the value is converted by repeatedly adding or subtracting one more than the maximum value that can be represented in the new type until the value is in the range of the new type.

3 Otherwise, the new type is signed and the value cannot be represented in it; either the result is implementation-defined or an implementation-defined signal is raised.

On implementations people are most likely to use, the conversion will occur the way you expect, i.e. the representation of the unsigned value will be reinterpreted as a signed value.

Specifically GCC does the following:

The result of, or the signal raised by, converting an integer to a signed integer type when the value cannot be represented in an object of that type (C90 6.2.1.2, C99 and C11 6.3.1.3).

For conversion to a type of width N, the value is reduced modulo 2^N to be within range of the type; no signal is raised.

MSVC:

When a long integer is cast to a short, or a short is cast to a char, the least-significant bytes are retained.

For example, this line
short x = (short)0x12345678L;
assigns the value 0x5678 to x, and this line
char y = (char)0x1234;
assigns the value 0x34 to y.

When signed variables are converted to unsigned and vice versa, the bit patterns remain the same. For example, casting -2 (0xFE) to an unsigned value yields 254 (also 0xFE).

So for these implementations, what you proposed will work.
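
For comparison, a conversion can be written so that it only ever triggers paragraph 1 of 6.3.1.3, never paragraph 3. The following is a sketch of that idea rather than something proposed in the thread; the name `to_signed32` is hypothetical, and it assumes the implementation provides `int32_t` and `uint32_t`.

```c
#include <stdint.h>

/* Hypothetical helper: map a uint32_t to the int32_t that has the
 * same two's complement bit pattern, using only value-preserving
 * conversions. */
static int32_t to_signed32(uint32_t u)
{
    if (u <= (uint32_t)INT32_MAX) {
        /* Representable in int32_t, so the value is unchanged. */
        return (int32_t)u;
    }

    /* Here u is in [2^31, 2^32 - 1], so UINT32_MAX - u is in
     * [0, 2^31 - 1] and converts to int32_t unchanged. The wanted
     * value u - 2^32 equals -(UINT32_MAX - u) - 1, which stays
     * within [INT32_MIN, -1]. */
    return -(int32_t)(UINT32_MAX - u) - 1;
}
```

Applied to the question, `to_signed32(timestamp1 - timestamp2)` gives the signed difference, at the cost of the branch the question hoped to avoid.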

This question is tagged language-lawyer and explicitly asks for a standard conforming solution. (Not my down vote.) — Eric Postpischil, Jul 02 '19 at 15:27

personal_cloud · Answer 4 · 2019-11-07T20:27:10.253

Rebranding Ian Abbott's macro-packaging of Bathsheba's answer as an answer:

#define UTOS32(a) ((union { uint32_t u; int32_t i; }){ .u = (a) }.i)

int32_t difference = UTOS32(timestamp1 - timestamp2);

Summarizing the discussions on why this is more portable than a simple typecast: The C standard (back to C99, at least) specifies the representation of int32_t (it must be two's complement), but not in all cases how it should be cast from uint32_t.

Finally, note that Ian's macro, Bathsheba's answer, and M.M's answer all also work in the more general case where the counters are allowed to wrap around 0, as is the case, for example, with TCP sequence numbers.
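
To illustrate the wraparound case, here is a small sketch that uses the macro to order two 32-bit counters that may have wrapped past 0. The helper name `seq_before` is hypothetical, and the comparison is only meaningful when the two counters are less than 2^31 apart.

```c
#include <stdint.h>

#define UTOS32(a) ((union { uint32_t u; int32_t i; }){ .u = (a) }.i)

/* Hypothetical helper: returns 1 if counter a comes before counter b,
 * assuming the two are less than 2^31 apart. */
static int seq_before(uint32_t a, uint32_t b)
{
    /* a - b wraps modulo 2^32; reinterpreted as signed, it is
     * negative exactly when a is "behind" b. */
    return UTOS32(a - b) < 0;
}
```

For instance, with `a = 0xFFFFFFF0` and `b = 0x00000010` (b has wrapped around), `UTOS32(a - b)` is `-32`, so `seq_before(a, b)` returns 1.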
