58

From an Example

unsigned long x = 12345678UL;

We have always learnt that the compiler needs to see only "long" in the above example to set aside 4 bytes of memory (on a 32-bit system). The question is why we should use L/UL in long constants even after declaring the variable to be a long.

Shash

4 Answers

94

When the suffix L or UL is omitted, the compiler gives the constant the first type, taken from a list, that can represent its value (see the C99 standard, clause 6.4.4:5; for a decimal constant, the list is int, long int, long long int).

As a consequence, most of the time it is not necessary to use the suffix: it does not change the meaning of the program. It does not change the meaning of your example initialization of x on most architectures, although it would if you had chosen a number that could not be represented as a long long. See also codebauer's answer for an example where the U part of the suffix is necessary.
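The selection rule can be observed directly. The following is only a sketch: it relies on C11's _Generic (not available in C99), it assumes a 32-bit int, and the macro TYPE_NAME and the three function names are made up for this illustration.

```c
/* Maps the type of an expression to a readable name (C11 _Generic). */
#define TYPE_NAME(x) _Generic((x), \
    int: "int", \
    unsigned int: "unsigned int", \
    long: "long", \
    unsigned long: "unsigned long", \
    long long: "long long", \
    unsigned long long: "unsigned long long", \
    default: "other")

/* 1 fits in int, the first type in the list. */
const char *type_of_small(void) { return TYPE_NAME(1); }

/* The UL suffix forces unsigned long regardless of the value. */
const char *type_of_suffixed(void) { return TYPE_NAME(1UL); }

/* 5000000000 does not fit in a 32-bit int, so the compiler moves on
   to long (if long is 64-bit) or long long (if long is 32-bit). */
const char *type_of_large(void) { return TYPE_NAME(5000000000); }
```

Printing the three results with printf("%s\n", ...) shows "int" and "unsigned long" for the first two; the third depends on the width of long on the platform.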


There are a couple of circumstances when the programmer may want to set the type of the constant explicitly. One example is when using a variadic function:

printf("%lld", 1LL); // correct, because 1LL has type long long
printf("%lld", 1);   // undefined behavior, because 1 has type int
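The same issue arises with any variadic function, not just printf. As a sketch, here is a hypothetical helper sum_ll (not a standard function) that reads every variadic argument as a long long; the caller has to supply arguments of exactly that type, which is what the LL suffix guarantees.

```c
#include <stdarg.h>

/* Hypothetical helper: sums `count` arguments, each read as long long. */
long long sum_ll(int count, ...)
{
    va_list ap;
    long long total = 0;

    va_start(ap, count);
    for (int i = 0; i < count; i++)
        total += va_arg(ap, long long); /* the caller must pass long long */
    va_end(ap);
    return total;
}

long long sum_demo(void)
{
    /* The LL suffix makes each argument a long long, matching va_arg above.
       Passing plain 1, 2, 3 (type int) here would be undefined behavior. */
    return sum_ll(3, 1LL, 2LL, 3LL);
}
```

No prototype tells the compiler what type the variadic arguments should have, so nothing converts an int argument to long long for you.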

A common reason to use a suffix is ensuring that the result of a computation doesn't overflow. Two examples are:

long x = 10000L * 4096L;
unsigned long long y = 1ULL << 36;

In both examples, without suffixes, the constants would have type int and the computation would be made as int. In each example this incurs a risk of overflow. Using the suffixes means that the computation will be done in a larger type instead, which has sufficient range for the result.
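A minimal sketch of the safe variants, with illustrative function names. The product uses larger operands than the example above so that it would not fit in a 32-bit int, and LL rather than L because long is only 32 bits wide on some platforms.

```c
/* 1ULL has type unsigned long long (at least 64 bits), so shifting
   by 36 is well defined. With a plain 1 (a 32-bit int) the shift
   count would exceed the width of the type: undefined behavior. */
unsigned long long shift_wide(void)
{
    return 1ULL << 36;
}

/* Both operands are long long, so the multiplication itself is done
   in long long. Written as 100000 * 100000, it would be an int
   multiplication that overflows when int is 32 bits wide. */
long long product_wide(void)
{
    return 100000LL * 100000LL;
}
```

Suffixing only one operand of the multiplication would be enough, because the other one is then converted to the wider type before the operation.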

As Lightness Races in Orbit puts it, the literal's suffix comes before the assignment. In the two examples above, simply declaring x as long and y as unsigned long long is not enough to prevent the overflow in the computation of the expressions assigned to them.


Another example is the comparison x < 12U where variable x has type int. Without the U suffix, the compiler types the constant 12 as an int, and the comparison is therefore a comparison of signed ints.

int x = -3;
printf("%d\n", x < 12); // prints 1 because it's true that -3 < 12

With the U suffix, the comparison becomes a comparison of unsigned ints. “Usual arithmetic conversions” mean that -3 is converted to a large unsigned int:

printf("%d\n", x < 12U); // prints 0 because (unsigned int)-3 is large

In fact, the type of a constant may even change the result of an arithmetic computation, again because of the way “usual arithmetic conversions” work.
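As an illustration with division (the function names are invented for this sketch), the only difference between the two functions below is the U suffix on the divisor:

```c
#include <limits.h> /* UINT_MAX, handy when checking the unsigned result */

/* Signed int division: -3 / 2 truncates toward zero and gives -1. */
long long div_signed(int x)
{
    return x / 2;
}

/* 2U is an unsigned int, so x is converted to unsigned int first.
   For x == -3 the dividend becomes UINT_MAX - 2, and the quotient
   is a large positive number. */
long long div_unsigned(int x)
{
    return x / 2U;
}
```

With a 32-bit unsigned int, div_unsigned(-3) evaluates 4294967293 / 2, that is 2147483646, while div_signed(-3) is -1.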


Note that, for decimal constants, the list of types suggested by C99 does not contain unsigned long long. In C90, the list ended with the largest standardized unsigned integer type at the time (which was unsigned long). A consequence was that the meaning of some programs was changed by adding the standard type long long to C99: the same constant that was typed as unsigned long in C90 could now be typed as a signed long long instead. I believe this is the reason why in C99, it was decided not to have unsigned long long in the list of types for decimal constants. See this and this blog posts for an example.
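A sketch of the kind of program whose meaning changed, assuming a 32-bit int (the function name is made up). Under C99 and later rules, 2147483648 gets a signed type (long or long long), so the comparison below is an ordinary signed comparison. Under C90 rules on a platform with a 32-bit long, the same constant would be an unsigned long, -1 would be converted to ULONG_MAX, and the comparison would be false.

```c
/* Returns 1 when -1 compares less than the unsuffixed constant.
   C99 and later: the constant is signed, so the result is 1.
   C90 with a 32-bit long: the constant is unsigned long, -1 becomes
   ULONG_MAX, and the result is 0. */
int minus_one_is_smaller(void)
{
    return -1 < 2147483648;
}
```

Compiling the same source in the two language modes on a platform where long is 32 bits wide is the way to see both outcomes; where long is 64 bits wide, both modes give 1.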

Pascal Cuoq
  • 2
    Small addition: it can also improve readability and hint about the suggested usage in some cases. E.g. you might have something like `#define MY_DEFINE 123456789UL` and you use `MY_DEFINE` later in the code. Naturally, it doesn't have type associated with it so `UL` addition may be of little help here. – SomeWittyUsername Oct 30 '12 at 08:33
  • 3
    Even though the compiler can pick the size of a numeric literal, it doesn't automatically determine whether it's signed or not. For example, `18446744073709551615` is treated as `-1L` on systems with a 64-bit `long`. You have to explicitly use `UL`. – Nikos C. Oct 30 '12 at 08:37
  • @NikosChantziaras Perhaps at the same time you were writing your comment, I was expanding on the case of the list of types for decimal constants not containing any unsigned types, with a theory for the reason. – Pascal Cuoq Oct 30 '12 at 08:41
  • Another fairly common scenario where type suffixes are needed are bit shifts, `1 << 36` is probably UB, `1ULL << 36` is safe. Perhaps worth to be added in the list of examples. – Daniel Fischer Oct 30 '12 at 13:40
  • @DanielFischer I have grouped that with caf's multiplication example. – Pascal Cuoq Oct 30 '12 at 13:55
  • IMHO it would be justified to mention the `*_C` macros in `stdint.h` in this answer as well. It's already quite verbose ;) E.g. `UINT64_C(x)` produces a literal of value x with the right suffix to make its type `uint64_t` - thus there is no need for specific prefixes for the `stdint.h` data types. – stefanct Oct 26 '17 at 17:54
  • @stefanct You can put this information in your own answer if you think it is useful. Re-reading the question, I think it's a digression. There is nothing in the question that indicates that the OP wants to know about the `_C` macro. Thanks for your comment. – Pascal Cuoq Oct 26 '17 at 20:18
  • @pascal Cuoq: I am getting correct result for this: signed long long var3 = 2147483648+2; In this case both operands in RHS are ints and result is int and must overflow right? But result is 2147483650. This was using C++11. – Rajesh Jan 17 '18 at 01:18
  • @Rajesh No, `2147483648` is not an `int` – Pascal Cuoq Jan 17 '18 at 09:15
  • 1
    @pascal Cuoq : I think I got it. Default integer type literal could be int, long or long long depending on the value of the literal. In the case of unsigned long var1=4294967299*2; 4294967299 is considered as long long. 2 is promoted to long long and hence 4294967299*2 = 8589934598. Since LHS is unsigned long, truncation occurs and hence Result = 8589934598 - 4294967296 - 4294967296 = 6. Hope I am right. – Rajesh Jan 17 '18 at 13:32
  • 1
    @Rajesh You are entirely correct. On most architectures, `4294967299*2` is a well-defined expression, of type `long` or `long long` (`long` if `long` is 64-bit). On the other hand `2000000000*3` is an expression of type `int` that contains undefined behavior, because `2000000000` is typed as `int` and the multiplication overflows. It's a funny language. – Pascal Cuoq Jan 17 '18 at 17:44
21

Because numerical literals are typically of type int. The UL/L suffix tells the compiler that they are not of type int, e.g. assuming a 32-bit int and a 64-bit long:

long i = 0xffff;
long j = 0xffffUL;

Here the values on the right must be converted to signed longs (32bit -> 64bit)

  1. The "0xffff", an int, would be converted to a long using sign extension, resulting in a negative value (0xffffffff)
  2. The "0xffffUL", an unsigned long, would be converted to a long, resulting in a positive value (0x0000ffff)
codebauer
  • 1
    Never thought about the printf. I work lots with arm, and have seen some 'interesting' vargs problems... – codebauer Oct 30 '12 at 09:35
  • 1
    I believe there is an example in there, but the details seem slightly off: 1) Hexadecimal constants are typed from another list that includes unsigned types 2) 0xffff is too small to set the sign bit on a 32-bit int 3) if a positive constant does not fit in a signed type without setting the sign bit, the next type in the list is tried. I tried to make a verified example, but I couldn't find the right constants. – Pascal Cuoq Oct 30 '12 at 09:50
  • 2
    @PascalCuoq Disagree that this is a _nice_ example. with C99, `0xffff` is an `int` with the value of 65,535. Assigning that to `i` is not an issue. `0xffffUL` is an `unsigned long` with the value of 65,535. Assigning that to `j` is also not an issue. Had this example been `long i = 0xffffffff;` `0xffffffff` is an `unsigned` with the value of 4,294,967,295 and assigning that to 64-`long` is not an issue. Also a non-issue with `long j = 0xffffffffUL;` This answer's #1 "converted to a long using sign extension" is not true here. – chux - Reinstate Monica Oct 26 '16 at 15:02
  • @chux You should argue with someone who said it were then. Why did you mention me? Please stop. – Pascal Cuoq Oct 28 '16 at 12:49
  • To answer your [query](http://stackoverflow.com/questions/13134956/what-is-the-reason-for-explicitly-declaring-l-or-ul-for-long-values/13135343?noredirect=1#comment67867980_13135343), I mentioned you in [this comment](http://stackoverflow.com/questions/13134956/what-is-the-reason-for-explicitly-declaring-l-or-ul-for-long-values/13135343?noredirect=1#comment67791889_13135343) in response to [your comment](http://stackoverflow.com/questions/13134956/what-is-the-reason-for-explicitly-declaring-l-or-ul-for-long-values/13135343?noredirect=1#comment17864303_13135343) – chux - Reinstate Monica Oct 28 '16 at 13:31
  • @chux “Hey, in a 2012 discussion you said that an example of weird C behavior was nice at 8:42 before relenting and explaining why the example doesn't work at 9:50. Please let me explain to you now in 2016 at great length all the things that you have already shown you understand four years ago” “No thanks” – Pascal Cuoq Oct 30 '16 at 20:49
13

The question is why we should use L/UL in long constants even after declaring the variable to be a long.

Because it's not "after"; it's "before".

First you have the literal, then it is converted to the type of the variable you're trying to squeeze it into.

They are two objects. The type of the target is designated by the unsigned long keywords, as you've said. The type of the source is designated by this suffix because that's the only way to specify the type of a literal.
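A small sketch of this ordering, with illustrative function names: the expression on the right is evaluated in the literal's own type, and only the finished value is converted to the target type.

```c
#include <limits.h> /* UINT_MAX, ULLONG_MAX for checking the results */

/* ~0U is evaluated as an unsigned int (all bits of an unsigned int
   set), and only then widened. The result is UINT_MAX, not
   ULLONG_MAX, even though the target type is wider. */
unsigned long long narrow_mask(void)
{
    return ~0U;
}

/* ~0ULL is evaluated as an unsigned long long from the start, so
   every bit of the wider type is set. */
unsigned long long wide_mask(void)
{
    return ~0ULL;
}
```

The return type plays the role of the declared variable here: it is the same in both functions, yet the results differ because the source types differ.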

Lightness Races in Orbit
2

A related question is why one would use the u suffix.

One reason for u is to allow a decimal integer constant greater than LLONG_MAX.

// Likely to generate a warning.
unsigned long long limit63bit = 18446744073709551615; // 2^64 - 1

// OK
unsigned long long limit63bit = 18446744073709551615u;
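
As a quick check of the suffixed form, here is a sketch that assumes unsigned long long is exactly 64 bits wide with 8-bit bytes (the function names are invented for the illustration):

```c
#include <limits.h> /* ULLONG_MAX for checking the value */

/* With the u suffix, the constant takes the first of unsigned int,
   unsigned long, unsigned long long that can represent it. */
unsigned long long max_from_decimal(void)
{
    return 18446744073709551615u;
}

/* Size in bytes of the constant's own type, before any conversion. */
unsigned long size_of_constant(void)
{
    return (unsigned long)sizeof(18446744073709551615u);
}
```

Under those assumptions the first function returns ULLONG_MAX and the second returns 8.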
chux - Reinstate Monica