50

If the representation of a long int and an int are the same on a platform, are the types strictly the same? Do they behave differently on that platform in any way, according to the C standard?

E.g., does this always work:

int int_var;
long long_var;

void long_bar(long *l);
void int_bar(int *i);

void foo() 
{
    long_bar(&int_var); /* Always OK? */
    int_bar(&long_var);
}

I guess the same question applies to short and int, if they happen to be the same representation.

The question arose when discussing how to define an int32_t-like typedef for an embedded C89 compiler without stdint.h, i.e. whether to define it as int or as long, and whether the choice would matter.

user3840170
Vilhelm
  • 24
    Different types are different, even if their bitwise representation is the same. What you're doing breaks *strict aliasing*. – Some programmer dude Mar 29 '21 at 07:47
  • 2
    Do you ask your question with a language lawyer perspective (then read [n1570](http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1570.pdf) or better....) or do you care *in practice* - then with what compiler and target processor? If your compiler is [GCC](http://gcc.gnu.org/) use it as `gcc -Wall -Wextra` – Basile Starynkevitch Mar 29 '21 at 07:55
  • Yeah, well, from a practical perspective: how to define a 32-bit typedef for a platform without stdint.h. I edited the question. As in "does it matter in any way?". Language lawyering is interesting too, though; I'll read that. – Vilhelm Mar 29 '21 at 08:05
  • BTW, recent [GCC](http://gcc.gnu.org/) compilers are good embedded cross-compilers for newer versions of C. Consider asking permissions to use them. See also the [DECODER](https://www.decoder-project.eu/) project (and contact me by email to `basile.starynkevitch@cea.fr` about it) – Basile Starynkevitch Mar 29 '21 at 08:05
  • Are you *coding* that C89 cross-compiler, or did you buy it and are just *using* it? Are you allowed to switch to some better cross-compiler? What is the target processor? – Basile Starynkevitch Mar 29 '21 at 08:21
  • "define a "uint32_t"-like typedef for an embedded C89 compiler without stdint.h, i.e. as "int" or "long" and if it would matter." is amiss on its face. `uint32_t` should never get defined as a _signed_ type. Post adjusted. – chux - Reinstate Monica Mar 29 '21 at 11:55
  • 9
    even if their sizes are the same, they may still have different range because C allows types to have trap representations and padding bits. For example `int` can have 0x80000000 as a trap but `long` can store it normally. Or `int` can have 2 padding bits but `long` has only 1 – phuclv Mar 29 '21 at 12:16
  • 3
    another example is `char`, `signed char`, `unsigned char` are 3 distinct types with the same size but their ranges are also different from `_Bool` which is another type typically having the same size – phuclv Mar 29 '21 at 12:26
  • `struct databaseid { int id; };` also has the same representation as `int`. – Mooing Duck Mar 29 '21 at 20:56
  • @BasileStarynkevitch: The optimizers in clang and gcc make assumptions that would be inappropriate for many embedded systems programming tasks, and so far as I can tell, the only way to disable such phony breaking "optimizations" is to disable optimization entirely. – supercat Mar 29 '21 at 23:44
  • 1
    With GCC, you can code your own GCC plugin to change its behavior. And both GCC and Clang are opensource compilers, that you could improve – Basile Starynkevitch Mar 30 '21 at 05:10
  • This relates to my question, whose answers you might find interesting: https://stackoverflow.com/q/16138237/541686 – user541686 Mar 30 '21 at 12:23
  • Questions like these make me so glad that I program in languages that don't have any of this pain. – Ian Kemp Mar 30 '21 at 17:11

3 Answers

41

They are not compatible types, which you can see with a simple example:

int* iptr;
long* lptr = iptr; // constraint violation: the compiler must issue a diagnostic

So it mostly matters when dealing with pointers to these types. Similarly, there is the "strict aliasing rule" which makes this code undefined behavior:

int i;
long* lptr = (long*)&i;
*lptr = ...;  // undefined behavior
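
A well-defined alternative to the cast is to go through a temporary object of the right type, so the conversion happens by value rather than through an incompatible pointer. A minimal sketch, reusing `int_var` and `long_bar` from the question; the body of `long_bar` and the name `foo_fixed` are assumptions made only for illustration:

```c
int int_var = 41;

/* Hypothetical body: the question only declares long_bar. */
void long_bar(long *l)
{
    *l += 1;
}

void foo_fixed(void)
{
    long tmp = int_var;   /* int -> long by value: always well-defined */
    long_bar(&tmp);       /* pointer type matches the parameter exactly */
    int_var = (int)tmp;   /* copy back; assumes the result fits in int */
}
```

This costs one extra copy in each direction, but it does not depend on int and long sharing a representation.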

Another subtle issue is implicit promotion. In some_int + some_long, the resulting type of the expression is long, or unsigned long if either operand is unsigned (given that the two types have the same size). This follows from the usual arithmetic conversions, see Implicit type promotion rules. It shouldn't matter most of the time, but code such as this will fail to compile: _Generic(some_int + some_long, int: stuff() ) since there is no long association in the generic selection.
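
The distinction can be made visible with a generic selection that lists both types. A small sketch, assuming a C11 compiler; `TYPE_NAME` and `sum_type_name` are hypothetical names, not part of any library:

```c
/* Hypothetical helper macro: maps the type of an expression to a name. */
#define TYPE_NAME(x) _Generic((x), \
    int: "int", \
    long: "long", \
    unsigned long: "unsigned long", \
    default: "other")

const char *sum_type_name(void)
{
    int some_int = 1;
    long some_long = 2;
    /* The usual arithmetic conversions convert the int operand to long. */
    return TYPE_NAME(some_int + some_long);
}
```

That `int:` and `long:` can appear side by side in one selection is itself evidence that the compiler treats them as distinct types, whatever their sizes.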

Generally, assigning values between the types shouldn't cause any problems. In the case of uint32_t, it doesn't matter which type it corresponds to, because you should treat uint32_t as a separate type anyway. I'd pick long for compatibility with small microcontrollers, where int is only 16 bits wide and typedef unsigned int uint32_t; will break. (And obviously, typedef signed long int32_t; for the signed equivalent.)
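
For a compiler without stdint.h, one possible way to write such typedefs is to select the type from limits.h and guard it with a C89-style compile-time check. This is only a sketch: the names `my_int32`, `my_uint32` and `my_int32_must_be_32_bits` are made up, and the size check assumes the chosen type has no padding bits.

```c
#include <limits.h>

/* Hypothetical names; a real project would pick its own prefix. */
#if ULONG_MAX == 0xFFFFFFFFUL
typedef long          my_int32;    /* preferred: long is at least 32 bits */
typedef unsigned long my_uint32;
#elif UINT_MAX == 0xFFFFFFFFUL
typedef int           my_int32;    /* fallback where long is wider */
typedef unsigned int  my_uint32;
#else
#error "no 32-bit integer type found among int and long"
#endif

/* C89-style static assertion: the array size is -1 if the test fails. */
typedef char my_int32_must_be_32_bits[
    (sizeof(my_int32) * CHAR_BIT == 32) ? 1 : -1];
```

Code that uses these typedefs should still treat them as types of their own and not rely on which underlying type was selected.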

Lundin
  • @chux-ReinstateMonica: There's no `long` entry in `_Generic(some_int + some_long, int: stuff() )`. – user2357112 Mar 29 '21 at 12:15
  • @chux-ReinstateMonica What has signedness to do with anything? `unsigned int` is 16 bit on small MCUs. So if you pick `unsigned int` for `uint32_t` you define an `uint32_t` with 16 bits. – Lundin Mar 29 '21 at 13:06
  • I'm probably being obtuse, but -- it seems like your "implicit promotion" example isn't really about implicit promotion; you'd get the same issue with just `_Generic(some_long, int: stuff() )` or `_Generic(some_int, long: stuff() )`. – ruakh Mar 29 '21 at 18:45
  • @ruakh The point is that `_Generic(some_int + some_int, int: stuff() )` would have worked. By introducing a `long` in an expression subject to usual arithmetic conversion, I changed the type to `long`, even if `long` happens to have the same size as `int`. – Lundin Mar 30 '21 at 13:56
  • Similarly, consider `_Generic( ... , int: stuff(), int32_t: stuff)` vs `_Generic( ... , long: stuff(), int: stuff)`. The former will yield a compiler error when `int32_t` is actually the same type as `int`. But the latter won't, since they are distinctive types of that same size. – Lundin Mar 30 '21 at 14:02
  • @Lundin: Yes, exactly. The important point is that _Generic regards them as distinct types. The bit about implicit promotion would be meaningless if they weren't distinct types, and has no interesting consequences on its own (or if it does then you haven't mentioned them). – ruakh Mar 30 '21 at 16:43
15

The types long and int have different conversion ranks: the rank of long is higher than the rank of int. So in a binary expression with one operand of type long and one operand of type int, the int operand is always converted to long.

Compare the following code snippets.

int x = 0;
unsigned int y = 0;

Here the type of the expression x + y is unsigned int.

long x = 0;
unsigned int y = 0;

Here the type of the expression x + y is unsigned long (due to the usual arithmetic conversions), provided that sizeof(int) is equal to sizeof(long).

This matters even more in C++ than in C, because C++ allows function overloading.

In C you have to take it into account, for example, when using I/O functions such as printf, so that you specify the correct conversion specifier.
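
A short sketch of the printf point, using snprintf (C99) so that the result can be inspected; `format_int_and_long` is a hypothetical helper:

```c
#include <stdio.h>

/* Hypothetical helper; returns whatever snprintf returns. */
int format_int_and_long(char *buf, size_t size, int i, long l)
{
    /* %d consumes an int and %ld consumes a long. Swapping them is
       undefined behavior even where the two types have the same size. */
    return snprintf(buf, size, "%d %ld", i, l);
}
```

Compilers such as GCC can diagnose a mismatched specifier with `-Wall`, because they check the argument types and not just their sizes.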

Vlad from Moscow
5

Even on platforms where long and int have the same representation, the Standard would allow compilers to be willfully blind to the possibility that storing a value through a long* might affect the value of an object read through an int*, or vice versa. Given something like:

#include <stdint.h>

void store_to_int32(void *p, int index)
{
    ((int32_t*)p)[index] = 2;
}
int array1[10];
int test1(int index)
{
    array1[0] = 1;
    store_to_int32(array1, index);
    return array1[0];
}
long array2[10];
long test2(int index)
{
    array2[0] = 1;
    store_to_int32(array2, index);
    return array2[0];
}

The 32-bit ARM version of gcc will treat int32_t as synonymous with long and ignore the possibility that passing the address of array1 to store_to_int32 might cause the first element of that array to be written, and the 32-bit version of clang will treat int32_t as synonymous with int and ignore the possibility that passing the address of array2 to store_to_int32 might cause that array's first element to be written.

To be sure, nothing in the Standard would prohibit compilers from behaving in that fashion, but I think the Standard's failure to prohibit such blindness stems from the principle "the dumber something would be, the less need there should be to prohibit it".
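
One commonly used way to sidestep the aliasing question is to copy bytes with memcpy instead of storing through an int32_t lvalue. The following is a sketch under that assumption, not a claim about what any particular compiler generates; `store_to_int32_bytes` is a hypothetical variant of the function above:

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical variant of store_to_int32: the destination is written
   as raw bytes, so the declared type of the array does not matter to
   the aliasing rules. */
void store_to_int32_bytes(void *p, int index)
{
    int32_t value = 2;
    memcpy((unsigned char *)p + (size_t)index * sizeof value,
           &value, sizeof value);
}
```

The result is only meaningful when the destination elements have the same size and representation as int32_t, which is the premise of the question.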

supercat