
As stated in the title, I have come across something very strange regarding explicit vs. implicit typecasting with GCC on Linux.

I have the following simple code to demonstrate the problem:

#include <stdio.h>
#include <stdint.h>

int main(void)
{
   int i;
   uint32_t e1 = 0;
   uint32_t e2 = 0;
   const float p = 27.7777;

   printf("#   e1 (unsigned)   e1 (signed)       e2 (unsigned)   e2 (signed)\n");
   for (i = 0; i < 10; i++) {
      printf("%d   %13u   %11d       %13u   %11d\n", i, e1, e1, e2, e2);
      e1 -= (int)p;   /* explicit conversion: p is cast to int before the subtraction */
      e2 -= p;        /* implicit conversion: the compiler applies the conversion rules */
   }
   return 0;
}

As you can see, e1 is decremented by p explicitly cast to an int, while e2 is decremented by p converted implicitly.

I expected e1 and e2 to contain the same value, but they do not... In fact, the result appears to be system dependent.

For testing the code, I have two virtual machines (VirtualBox started with Vagrant). Here is the first machine:

vagrant@vagrant:~$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 16.04.3 LTS
Release:        16.04
Codename:       xenial
vagrant@vagrant:~$ uname -a
Linux vagrant 4.4.0-92-generic #115-Ubuntu SMP Thu Aug 10 16:02:55 UTC 2017 i686 i686 i686 GNU/Linux
vagrant@vagrant:~$ gcc --version
gcc (Ubuntu 5.4.0-6ubuntu1~16.04.4) 5.4.0 20160609
Copyright (C) 2015 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

To build and run the program, I use the following:

vagrant@vagrant:~$ gcc -Wall /vagrant/test.c
vagrant@vagrant:~$ ./a.out
#   e1 (unsigned)   e1 (signed)       e2 (unsigned)   e2 (signed)
0               0             0                   0             0
1      4294967269           -27          4294967269           -27
2      4294967242           -54          4294967268           -28
3      4294967215           -81          4294967268           -28
4      4294967188          -108          4294967268           -28
5      4294967161          -135          4294967268           -28
6      4294967134          -162          4294967268           -28
7      4294967107          -189          4294967268           -28
8      4294967080          -216          4294967268           -28
9      4294967053          -243          4294967268           -28
vagrant@vagrant:~$

As you can see, everything looks fine for e1, the one using the explicit cast, but the result for e2 is quite strange...

Then I try the same on another virtual machine, which is a 64-bit version of Ubuntu:

vagrant@ubuntu-xenial:~$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 16.04.3 LTS
Release:        16.04
Codename:       xenial
vagrant@ubuntu-xenial:~$ uname -a
Linux ubuntu-xenial 4.4.0-112-generic #135-Ubuntu SMP Fri Jan 19 11:48:36 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
vagrant@ubuntu-xenial:~$ gcc --version
gcc (Ubuntu 5.4.0-6ubuntu1~16.04.5) 5.4.0 20160609
Copyright (C) 2015 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

And here is the output of the application:

vagrant@ubuntu-xenial:~$ gcc -Wall /vagrant/test.c
vagrant@ubuntu-xenial:~$ ./a.out
#   e1 (unsigned)   e1 (signed)       e2 (unsigned)   e2 (signed)
0               0             0                   0             0
1      4294967269           -27          4294967269           -27
2      4294967242           -54                   0             0
3      4294967215           -81          4294967269           -27
4      4294967188          -108                   0             0
5      4294967161          -135          4294967269           -27
6      4294967134          -162                   0             0
7      4294967107          -189          4294967269           -27
8      4294967080          -216                   0             0
9      4294967053          -243          4294967269           -27

The values of e2 are still not what I expected, and they now also differ from what I got on the 32-bit system.

I have no idea whether this difference is caused by 32-bit vs. 64-bit, or whether it is related to something else.

However, I would like to understand why there is a difference between e1 and e2, and whether it is possible to at least get a warning from GCC when this happens.

Thanks :-)

Tue Henriksen
  • Looking at the assembly generated out of the two cases, it seems that in the case of explicit cast, `p` is cast into an integer and then subtracted from `e1`. In the case of implicit cast, `e2` is first cast into floating point, then `p` is subtracted, and the result is then cast into an integer. Maybe someone can point out the place where this is specified in the standard...? As it is, output from a Visual C compiled program is similar to gcc. – jlahd Jan 30 '18 at 14:06
  • If you replace `e2 -= p;` with `e2 -= (uint32_t)p;` then both `e2` and `e1` track the same. You have the question: *In the expression `e2 -= p;` what type casting is occurring for the operation?* According to the rules of conversion, this behaves like `e2 = e2 - p` which would cast `e2` to a `float` first, do the subtraction, then cast the result back to `uint32_t`. This then leaves the question: *What happens to `e2` when it's converted to a float?*. `4294967269` is a large int and, in the process of going to single precision `float`, loses significant digit precision. Thus, the odd results. – lurker Jan 30 '18 at 14:13
  • Nomenclature: there are implicit _conversions_ and explicit _conversions_. A _cast_ is the action where a human programmer forces an explicit conversion on purpose, by using the cast operator. – Lundin Jan 30 '18 at 14:28
  • Anyway, this seems like a somewhat common FAQ. I've added the linked post as a canonical duplicate to the SO [C FAQ](https://stackoverflow.com/tags/c/info). – Lundin Jan 30 '18 at 14:35

1 Answer


The statement e2 -= p is equivalent to e2 = e2 - p. Because e2 - p is a subtraction involving a float (e2 = unsigned_val - float_val), the usual arithmetic conversions convert e2 to float, the subtraction is performed in floating point, and you end up assigning a float value to an unsigned integer variable.
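
In other words, the compound assignment effectively expands to something like the following sketch (the temporary variable is purely for illustration):

/* e2 -= p; behaves as if written: */
float tmp = (float)e2 - p;  /* e2 is converted to float; subtraction done in float */
e2 = (uint32_t)tmp;         /* the float result is converted back to uint32_t */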

The other statement, e1 -= (int)p, is not equivalent: the cast yields the int value 27 before the subtraction, so what happens here is e1 = unsigned_val - int_val, and the value assigned to e1 is an integer rather than a floating-point value.
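
For comparison, e1's update never leaves integer arithmetic (again, an illustrative expansion):

/* e1 -= (int)p; behaves as if written: */
int ip = (int)p;  /* 27: the fractional part is discarded -- well defined */
e1 = e1 - ip;     /* ip is converted to uint32_t; unsigned arithmetic wraps mod 2^32 */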

Converting a floating-point value to an integer type when the value (after discarding the fractional part) is out of range for that type is undefined behaviour (see 6.3.1.4 Real floating and integer of the C standard). Since e2 - p is negative on the first iteration, its conversion back to uint32_t is undefined, which is why you are experiencing differing behaviour across platforms with your assignment to e2.
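
A minimal fix, along the lines suggested in the comments, is to make the conversion explicit so that only well-defined integer arithmetic is involved:

e2 -= (uint32_t)p;  /* 27.7777 truncates to 27, representable in uint32_t -- defined */

With this change, e2 tracks e1 exactly on both machines.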

Candy Gumdrop
  • Thanks for the explanation! It makes sense now, and I can see that it will not be possible to get a warning for this situation. – Tue Henriksen Jan 30 '18 at 14:34