
I ran some quick tests (on an online debugger) showing that casting a signed int to an unsigned int in C does not change the bit values.

What I want to know is whether this is guaranteed by the C standard, or whether it is just the common (but not 100% certain) behaviour?

Guillaume Petitjean
  • It changes the *context*, that is, what the value *means*. All computer data consists of a limited set of numbers, which have meaning in context. However it is rather different when casting, say, `int` to `float`, when the representation will change. – Weather Vane Oct 16 '19 at 14:33
  • Only an assignment would change a value. Casting from signed (2s-complement) to unsigned is an unsafe cast, and I think, implementation specific. – LegendofPedro Oct 16 '19 at 14:35
  • Possible duplicate? https://stackoverflow.com/questions/50605/signed-to-unsigned-conversion-in-c-is-it-always-safe – Adrian Mole Oct 16 '19 at 14:38
  • I've read this post @Adrian but it is about explicit conversion – Guillaume Petitjean Oct 16 '19 at 14:39
  • @LegendofPedro it would be safe with two `int` values which are known to be non-negative, and you want their sum without danger of `int` overflow. – Weather Vane Oct 16 '19 at 14:40
  • @WeatherVane (and LegendofPedro): The cast is perfectly well-defined, without danger of overflow, regardless of the sign of the signed int going in. – Steve Summit Oct 16 '19 at 14:50
  • @SteveSummit it might be well-defined, but is it useful? How does `(unsigned)-1` not "overflow" in the usual sense of the word? – Weather Vane Oct 16 '19 at 15:45
  • @WeatherVane Ah, okay, I see your point. To me it's "useful" in that it enables a bunch of shortcuts, although I guess those shortcuts are also arguably bad habits. See also Antti Haapala's answer. – Steve Summit Oct 16 '19 at 16:09
  • @SteveSummit, I have to copy a signed int variable into an element of an array of unsigned int (a buffer for transmission on a serial-like port). The data carried by this buffer can be of different types and is interpreted by the destination. I know it sounds pretty obvious, and I have done this kind of thing hundreds of times, but when I started to wonder what the best way to go is, it ended up being not so obvious after all :) – Guillaume Petitjean Oct 17 '19 at 07:39
  • @GuillaumePetitjean Treating data as unsigned for the purposes of data transmissions is one of the "shortcuts" I was referring to in my answer to Weather Vane. Personally I think it's a fine technique, although I suppose there's a question of whether, on the other end, that unsigned int is guaranteed to be convertible back to the original signed int, especially if it had been negative. – Steve Summit Oct 17 '19 at 10:52
  • @SteveSummit, yes, there are probably better ways to do that, like using unions of byte arrays and structures, but this is how the software I'm working on is designed. – Guillaume Petitjean Oct 17 '19 at 11:34
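
The round trip discussed in these comments could be sketched as follows. This is only an illustration under stated assumptions: `pack_int` and `unpack_int` are hypothetical names, both ends are assumed to use the same `int` width, and the receiving implementation is assumed to satisfy `INT_MIN == -INT_MAX - 1` and `UINT_MAX == 2 * INT_MAX + 1` (typical for two's complement). The point is that the signed-to-unsigned direction is fully defined, while the way back avoids the implementation-defined out-of-range conversion of C11 6.3.1.3p3.

```c
#include <limits.h>

/* Hypothetical helper: store a signed value in an unsigned slot.
   This direction is well-defined for every int (C11 6.3.1.3p2). */
unsigned int pack_int(int value)
{
    return (unsigned int)value;
}

/* Hypothetical helper: recover the signed value on the receiving side
   without converting an out-of-range unsigned value to int.
   Assumes INT_MIN == -INT_MAX - 1 and UINT_MAX == 2 * INT_MAX + 1. */
int unpack_int(unsigned int word)
{
    if (word <= (unsigned int)INT_MAX)
        return (int)word;               /* value fits: conversion keeps it */

    /* word encodes the negative value v = word - (UINT_MAX + 1),
       so -v - 1 == UINT_MAX - word, which lies in 0..INT_MAX. */
    return -(int)(UINT_MAX - word) - 1;
}
```

For example, `unpack_int(pack_int(-5))` yields `-5` under the assumptions above.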

2 Answers

7

Conversion from signed int to unsigned int does not change the bit representation in two’s-complement C implementations, which are the most common. It does change the bit representation of negative numbers (including possible negative zeroes) on one’s-complement or sign-and-magnitude systems.

This is because the cast `(unsigned int)a` is not defined to retain the bits; rather, the result is the non-negative remainder of dividing `a` by `UINT_MAX + 1`. Or, as the C standard (C11 6.3.1.3p2) says:

> the value is converted by repeatedly adding or subtracting one more than the maximum value that can be represented in the new type until the value is in the range of the new type.

The two’s complement representation for negative numbers is the most commonly used representation for signed numbers exactly because it has this property of negative value n mapping to the same bit pattern as the mathematical value n + UINT_MAX + 1 – it makes it possible to use the same machine instruction for signed and unsigned addition, and the negative numbers will work because of wraparound.
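
A minimal sketch of how one might check both claims on a given implementation. The function names are made up for illustration. `wrapped_value` computes `i + UINT_MAX + 1` for negative `i` without relying on the conversion being tested, and `same_bit_pattern` compares the object representations with `memcmp`; the latter is expected to return true on two's-complement implementations only.

```c
#include <limits.h>
#include <stdbool.h>
#include <string.h>

/* Value prescribed by C11 6.3.1.3p2 for converting i to unsigned int:
   i itself if non-negative, otherwise i + UINT_MAX + 1. */
unsigned int wrapped_value(int i)
{
    if (i >= 0)
        return (unsigned int)i;          /* in range: value is preserved */

    /* -(i + 1) is in 0..INT_MAX, so converting it is value-preserving;
       UINT_MAX - (-(i + 1)) == i + UINT_MAX + 1. */
    return UINT_MAX - (unsigned int)(-(i + 1));
}

/* True if the cast produces the value the standard requires. */
bool conversion_matches_rule(int i)
{
    return (unsigned int)i == wrapped_value(i);
}

/* True if the cast left the stored bits unchanged
   (int and unsigned int occupy the same amount of storage). */
bool same_bit_pattern(int i)
{
    unsigned int u = (unsigned int)i;
    return memcmp(&i, &u, sizeof i) == 0;
}
```

`conversion_matches_rule` should hold for every `int` on any conforming implementation, whereas `same_bit_pattern(-1)` depends on the representation of negative numbers.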

2

Casting from a signed to an unsigned integer is required to generate the correct arithmetic result (the same number), modulo the size of the unsigned integer, so to speak. That is, after

    int i = anything;
    unsigned int u = (unsigned int)i;

and on a machine with 32-bit ints, the requirement is that u is equal to i, modulo 2³².

(We could also try to say that u receives the value i % 0x100000000, except it turns out that's not quite right, because the C rules say that when you divide a negative integer by a positive integer, you get a quotient rounded towards 0 and a negative remainder, which isn't the kind of modulus we want here.)
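
To illustrate the difference, here is a small sketch with made-up function names, assuming `long long` is wider than `unsigned int` (true on typical implementations with 32-bit ints). It uses `UINT_MAX + 1` in place of the literal `0x100000000` so that it does not depend on the width of `int`.

```c
#include <limits.h>

/* Remainder as C's % operator computes it: the quotient is truncated
   towards zero, so the result has the sign of the dividend. */
long long c_remainder(int i)
{
    long long m = (long long)UINT_MAX + 1;
    return (long long)i % m;
}

/* Mathematical modulus, always in 0..UINT_MAX: this is the value
   that the conversion to unsigned int must produce. */
long long math_modulus(int i)
{
    long long m = (long long)UINT_MAX + 1;
    long long r = (long long)i % m;
    return r < 0 ? r + m : r;
}
```

For `i == -1`, `c_remainder` gives `-1`, while `math_modulus` gives `UINT_MAX`, which matches `(unsigned int)-1`.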

If i is 0 or positive, it's not hard to see that u will have the same bit pattern. If i is negative, and if you're on a 2's complement machine, it turns out the result is also guaranteed to have the same bit pattern. (I'd love to present a nice proof of that result here, but I don't have time just now to try to construct it.)
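
One possible sketch of that argument, under the assumption that `int` and `unsigned int` both have N bits and no padding bits. Take a bit pattern b_{N-1} ... b_0 and compare the value it has as a two's-complement `int` with the value it has as an `unsigned int`:

```latex
% Assumption: int and unsigned int both have N bits and no padding bits.
\begin{align*}
s &= -b_{N-1}\,2^{N-1} + \sum_{k=0}^{N-2} b_k\,2^k
  && \text{(pattern read as two's-complement \texttt{int})}\\
v &= \phantom{-}b_{N-1}\,2^{N-1} + \sum_{k=0}^{N-2} b_k\,2^k
  && \text{(pattern read as \texttt{unsigned int})}\\
v - s &= b_{N-1}\,2^{N}
\end{align*}
% For s < 0 the sign bit b_{N-1} is 1, so
\[
v = s + 2^{N} = s + \texttt{UINT\_MAX} + 1,
\qquad 2^{N-1} \le v \le 2^{N} - 1 .
\]
```

So for a negative `i`, the unsigned value carried by the same bits is `i + UINT_MAX + 1`, which is in range and is exactly what C11 6.3.1.3p2 yields after adding `UINT_MAX + 1` once. Without padding bits, each unsigned value has exactly one bit pattern, so the converted result must have the same bits as `i`.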

The vast majority of today's machines use 2's complement. But if you were on a 1's complement or sign/magnitude machine, I'm pretty sure the bit patterns would not always be the same.

So, bottom line, the sameness of the bit patterns is not guaranteed by the C Standard, but arises due to a combination of the C Standard's requirements, and the particulars of 2's complement arithmetic.

Steve Summit
  • You should express the rules in terms of a two’s complement *C implementation*, not a two’s complement *machine*. While C is generally intended to use features matching its target processor, the implementation has the ultimate say. If I write a C implementation with one’s complement to support some ancient software I want to display in a computer museum, it is going to obey the one’s complement rules even if the underlying machine is two’s complement. – Eric Postpischil Oct 17 '19 at 00:08
  • @EricPostpischil are you aware of existing one's complement _machines_ ? – Guillaume Petitjean Oct 17 '19 at 07:35
  • @GuillaumePetitjean: As I just stated, C may be implemented via software rather than hardware. Thus, the underlying machine is not the determining factor. So the existence of machines of any particular type is irrelevant. The rules stated in the C standard are clear: The behavior is defined by the C implementation, not by the hardware. – Eric Postpischil Oct 17 '19 at 10:25
  • I fully understood that, @EricPostpischil. But I guess in practice it doesn't make sense to choose a 1's complement C implementation on a 2's complement machine. Just curious. – Guillaume Petitjean Oct 17 '19 at 11:36
  • @GuillaumePetitjean: Stating the behavior of C integers is determined by the machine is **false**. Stating the behavior of C integers is determined by the C implementation is **true**. It is that simple. One is true, the other is false, teaching students false statements is bad, and relying on false statements sometimes leads to errors in unexpected ways. C compilers have become increasingly aggressive about optimization, taking advantage of rules in the C standard even if they are not a direct outcome of the target hardware. There is no reason to make an incorrect statement here. – Eric Postpischil Oct 17 '19 at 11:47
  • Really, the mindset of “I know what the truth is, but this works. For me. In the circumstances I am familiar with. With current systems, neglecting any future possibilities. With current software, neglecting any future possibilities. Therefore, I will disregard the known truth and teach a falsehood.” is incomprehensible. Nobody disputes the C standard says the behavior is defined by the **implementation** not by the **machine**. So there is no justification for stating otherwise. Teach the truth. – Eric Postpischil Oct 17 '19 at 11:50
  • What the f... ??? I didn't state anything, I just asked a simple question. What's the big deal ? – Guillaume Petitjean Oct 17 '19 at 12:02
  • When I wrote "are you aware of existing one's complement machines ?" there was no irony whatsoever, it was a simple and real question – Guillaume Petitjean Oct 17 '19 at 12:04
  • Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/201021/discussion-between-guillaume-petitjean-and-eric-postpischil). – Guillaume Petitjean Oct 17 '19 at 12:29
  • @GuillaumePetitjean: I have no interest in discussing the existence, non-existence, or prevalence of one’s complement machines. It is not relevant to the fact that the C standard states the choice of two’s complement, one’s complement, or sign and magnitude is implementation-defined. If you want to know about them, you can start with the [Wikipedia article on one’s complement](https://en.wikipedia.org/wiki/Ones%27_complement), which mentions both historic and current systems. – Eric Postpischil Oct 17 '19 at 12:56
  • @GuillaumePetitjean *it doesn't make sense to choose a 1's complement C implementation on a 2's complement machine*. Yes and no. Various kinds of emulators and virtual machines are increasingly common. I myself have contemplated writing an emulator to let me play with 1's complement and sign/magnitude arithmetic. So I see Eric's point, although I'm not as worried about the wording I used as he is. (Among other things, there's probably not a bright line between an emulator you can write in software, and a virtual *machine*.) – Steve Summit Oct 17 '19 at 13:52
  • I didn't take your "are you aware of any?" question as ironic, but sometimes I do, because that kind of question is quite often a sign that someone is trying to rationalize some unnecessarily nonportable, machine-dependent (sorry, implementation-dependent) programming practice or another. Those questions are a hot button for those of us who rail against such rationalization, so I quite understand @EricPostpischil 's reaction. – Steve Summit Oct 17 '19 at 13:54
  • A lot of noise for nothing. And please avoid witch hunts. – Guillaume Petitjean Oct 17 '19 at 14:00