8
#include <stdio.h>

int arr[] = {1,2,3,4,5,6,7,8};
#define SIZE (sizeof(arr)/sizeof(int))

int main()
{
        printf("SIZE = %zu\n", SIZE);
        if ((-1) < SIZE)
                printf("less");
        else
                printf("more");
}

The output after compiling with gcc is "more". Why does the if condition fail even though -1 < 8?

Praetorian
manav m-n

6 Answers

18

The problem is in your comparison:

    if ((-1) < SIZE)

sizeof yields a value of type size_t, an unsigned type (typically unsigned long on 64-bit Linux), so SIZE is unsigned, whereas -1 is just an int. Under the usual arithmetic conversions in C and related languages, -1 is converted to size_t before the comparison, so it becomes a very large positive value (SIZE_MAX, the maximum value of size_t).
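A minimal sketch of the conversion described above. The helper names `int_as_size`, `less_after_conversion` and `show_conversion` are made up for illustration; `SIZE_MAX` comes from `<stdint.h>`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: spells out the conversion that the compiler
   applies implicitly to the int operand of (-1) < SIZE. */
size_t int_as_size(int value)
{
    /* Conversion to an unsigned type is done modulo SIZE_MAX + 1. */
    return (size_t)value;
}

/* Hypothetical helper: the comparison as the compiler evaluates it. */
int less_after_conversion(int a, size_t b)
{
    return int_as_size(a) < b;
}

/* Prints the converted value; the assert holds on every conforming
   implementation because -1 always maps to the largest size_t. */
void show_conversion(void)
{
    size_t converted = int_as_size(-1);
    assert(converted == SIZE_MAX);
    printf("-1 converted to size_t = %zu\n", converted);
}
```

Calling `less_after_conversion(-1, 8)` returns 0, which is why the original program takes the "more" branch.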

One way to fix this is to change the comparison to:

    if (-1 < (long long)SIZE)

although it's actually a pointless comparison, since an unsigned value will always be >= 0 by definition, and the compiler may well warn you about this.
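If a signed value really does have to be compared against a size, one possible alternative to casting the size is to test the sign first. This is only a sketch: `int_less_than_size` is a hypothetical helper, and the `static_assert` records the assumption that every non-negative int fits in size_t (true on mainstream platforms, checked at compile time here).

```c
#include <assert.h>   /* static_assert (C11) */
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption of this sketch: every non-negative int fits in size_t. */
static_assert(INT_MAX <= SIZE_MAX, "int must fit in size_t");

/* Hypothetical helper: value-correct "a < b" for an int and a size_t. */
int int_less_than_size(int a, size_t b)
{
    if (a < 0)
        return 1;            /* a negative int is below every size */
    return (size_t)a < b;    /* a >= 0 here, so the cast keeps its value */
}
```

Unlike a cast of the size to a signed type, this version does not misbehave for very large sizes, because the unsigned operand is never narrowed or reinterpreted.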

As subsequently noted by @Nobilis, you should always enable compiler warnings and take notice of them: if you had compiled with e.g. gcc -Wall -Wextra ... the compiler would have warned you of your bug (for C code, GCC enables -Wsign-compare with -Wextra rather than -Wall).

Paul R
  • @Dieter Lücking: why, specifically, and why the down-vote ? – Paul R Aug 15 '13 at 07:21
  • 1
    @DieterLücking This answer is perfectly correct: it is not only a signed/unsigned problem, the types of `-1` and of the value returned by `sizeof` also differ. In the comparison, `-1` is converted to `unsigned long int`, which is why SIZE is compared with `18,446,744,073,709,551,615UL` (assuming a 4-byte int and an 8-byte long) rather than with `-1`. – Grijesh Chauhan Aug 15 '13 at 07:30
  • Now it's better. But I would never do "if (-1 < (long long)SIZE)" in an unsigned context –  Aug 15 '13 at 07:45
  • @Dieter: perhaps you could explain why, and provide a better solution ? signed versus unsigned is often tricky, and sometimes there isn't a perfect solution, but if you have one I'd be interested to see it. – Paul R Aug 15 '13 at 07:47
  • 2
    @DieterLücking Why you don't prefer a **better suggestion** over `if ((-1) < (int)sizeof(x))` **?** – Grijesh Chauhan Aug 15 '13 at 07:53
  • 1
    The better solution is using unsigned for size_types (as the std does). Hence if(-1 < (signed)size) is always true - ignore huge unsigned numbers becoming negative in the cast. –  Aug 15 '13 at 08:11
  • 1
    @Dieter: the big problem with this is "ignore huge unsigned numbers becoming negative in the cast" - it's not uncommon to have arrays bigger than 4 GB these days, so you fail badly on systems where int is 32 bits and sizeof is 64 bits. – Paul R Aug 15 '13 at 08:15
  • @PaulR The statement is: A unsigned number is never negative. –  Aug 15 '13 at 08:22
  • See also: 23.2 Container requirements –  Aug 15 '13 at 08:23
  • @DieterLücking: an `unsigned` in C/C++ is not a number, but a member of `ℤ/n` – 6502 Aug 15 '13 at 08:30
  • 2
    @PaulR Actually I should remove my down vote: "although it's actually a pointless comparison, since an unsigned value will always be >= 0 by definition" –  Aug 15 '13 at 08:31
  • @PaulR Downvoted until you correct: "int will be promoted to unsigned long". NO, *promotions* only apply to `char` and `short` to `int`. In this case, `int` will be *converted* to `size_t`. – TemplateRex Aug 15 '13 at 09:04
  • @TemplateRex: thanks - point noted and answer updated. – Paul R Aug 15 '13 at 09:07
  • 1
    @PaulR +1 now (and tnx for the quick response) – TemplateRex Aug 15 '13 at 09:10
  • @PaulR BTW, I see on your profile that you are a C expert, does my answer also apply to C? (don't have a copy of the C Standard ready). – TemplateRex Aug 15 '13 at 09:12
  • 1
    @Dieter: I guess there are two separate issues: (i) fixing the immediate problem in the OP's code and (ii) the more fundamental problems of signed/unsigned comparisons and pointless comparisons of unsigned values. – Paul R Aug 15 '13 at 09:16
  • @TemplateRex: I don't know about "expert" - I've been writing C code for about 30 years but I don't think of myself as an expert yet. ;-) I *believe* the promotion/conversion rules for integer types are much the same in C and C++ but it's been a long time since I last studied the standards. – Paul R Aug 15 '13 at 09:26
  • 1
    @PaulR Gold C + C++ badges qualifies as expert in my book :-) – TemplateRex Aug 15 '13 at 09:29
  • 1
    "_if you had compiled with e.g. `gcc -Wall ...` the compiler would have warned you of your bug._" -- GCC warns with `-Wextra`. – Spikatrix Jun 21 '15 at 12:53
10

TL;DR

Be careful with mixed signed/unsigned operations (and enable compiler warnings such as -Wall -Wextra). The Standard has a long section about it. In particular, it is often but not always true that the signed operand is value-converted to unsigned (although it is in your particular example). See the explanation below (taken from this Q&A).

Relevant quote from the C++ Standard:

5 Expressions [expr]

10 Many binary operators that expect operands of arithmetic or enumeration type cause conversions and yield result types in a similar way. The purpose is to yield a common type, which is also the type of the result. This pattern is called the usual arithmetic conversions, which are defined as follows:

[2 clauses about equal types or types of equal sign omitted]

— Otherwise, if the operand that has unsigned integer type has rank greater than or equal to the rank of the type of the other operand, the operand with signed integer type shall be converted to the type of the operand with unsigned integer type.

— Otherwise, if the type of the operand with signed integer type can represent all of the values of the type of the operand with unsigned integer type, the operand with unsigned integer type shall be converted to the type of the operand with signed integer type.

— Otherwise, both operands shall be converted to the unsigned integer type corresponding to the type of the operand with signed integer type.

Your actual example

To see into which of the 3 cases your program falls, modify it slightly to this

#include <stdio.h>

int arr[] = {1,2,3,4,5,6,7,8};
#define SIZE (sizeof(arr)/sizeof(int))

int main()
{
        printf("SIZE = %zu, sizeof(-1) = %zu,  sizeof(SIZE) = %zu \n", SIZE, sizeof(-1), sizeof(SIZE));
        if ((-1) < SIZE)
                printf("less");
        else
                printf("more");
}

On the Coliru online compiler, this prints 4 and 8 for the sizeof() of -1 and SIZE, respectively, and selects the "more" branch (live example).

The reason is that the unsigned type has greater rank than the signed type. Hence, clause 1 applies and the signed operand is value-converted to the unsigned type. The conversion is defined modulo 2^N for an N-bit unsigned type, which on two's complement implementations preserves the bit representation, so -1 wraps around to a very large unsigned number. The comparison then selects the "more" branch.

Variations on a theme

Rewriting the condition to if ((long long)(-1) < (unsigned)SIZE) would take the "less" branch (live example).

The reason is that the signed type is of greater rank than the unsigned type and can also accommodate all the unsigned values. Hence, clause 2 applies and the unsigned type is converted to the signed type, and the comparison then proceeds to select the "less" branch.

Of course, you would never write such a contrived if() statement with explicit casts, but the same effect could happen if you compare variables with types long long and unsigned. So it illustrates the point that mixed signed/unsigned arithmetic is very subtle and depends on the relative sizes ("rank" in the words of the Standard). In particular, there is no fixed rule saying that signed will always be converted to unsigned.
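The three clauses can be observed directly with C11's `_Generic`. This is a sketch rather than anything from the Standard itself: the `TYPE_NAME` macro and the function names are hypothetical, and `+` is used instead of `<` because `<` always yields int in C while its operands undergo the same usual arithmetic conversions.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical macro: maps the type of an expression to a readable name.
   Only the types needed for this sketch are listed. */
#define TYPE_NAME(expr) _Generic((expr),      \
    int:                "int",                \
    unsigned int:       "unsigned int",       \
    long:               "long",               \
    unsigned long:      "unsigned long",      \
    long long:          "long long",          \
    unsigned long long: "unsigned long long", \
    default:            "other")

/* Clause 1: equal rank, so the signed operand becomes unsigned. */
const char *int_plus_unsigned(void)        { return TYPE_NAME(-1 + 0u); }
const char *long_plus_unsigned_long(void)  { return TYPE_NAME(-1L + 0uL); }

/* Clause 2 when long long can hold every unsigned int value. */
const char *llong_plus_unsigned(void)      { return TYPE_NAME(-1LL + 0u); }

/* Clause 2 or clause 3, depending on the width of unsigned long. */
const char *llong_plus_unsigned_long(void) { return TYPE_NAME(-1LL + 0uL); }

void print_common_types(void)
{
    /* These two results are fixed by clause 1 on every implementation. */
    assert(strcmp(int_plus_unsigned(), "unsigned int") == 0);
    assert(strcmp(long_plus_unsigned_long(), "unsigned long") == 0);

    printf("int + unsigned             -> %s\n", int_plus_unsigned());
    printf("long + unsigned long       -> %s\n", long_plus_unsigned_long());
    printf("long long + unsigned       -> %s\n", llong_plus_unsigned());
    printf("long long + unsigned long  -> %s\n", llong_plus_unsigned_long());
}
```

The last line is the interesting one: where unsigned long is as wide as long long, neither clause 1 nor clause 2 applies and the result falls through to clause 3; where unsigned long is narrower, clause 2 applies instead.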

TemplateRex
7

When you do comparison between signed and unsigned where unsigned has at least an equal rank to that of the signed type (see TemplateRex's answer for the exact rules), the signed is converted to the type of the unsigned.

With regard to your case, on a platform where size_t is 32 bits wide, -1 converted to unsigned is 4294967295 (with a 64-bit size_t it is 18446744073709551615). So in effect you are comparing whether 4294967295 is smaller than 8 (it isn't).

If you had enabled warnings, you would have been warned by the compiler that something fishy is going on:

warning: comparison between signed and unsigned integer expressions [-Wsign-compare]

Since the discussion has shifted a bit on how appropriate the use of unsigned is, let me put a quote by James Gosling with regards to the lack of unsigned types in Java (and I will shamelessly link to another post of mine on the subject):

Gosling: For me as a language designer, which I don't really count myself as these days, what "simple" really ended up meaning was could I expect J. Random Developer to hold the spec in his head. That definition says that, for instance, Java isn't -- and in fact a lot of these languages end up with a lot of corner cases, things that nobody really understands. Quiz any C developer about unsigned, and pretty soon you discover that almost no C developers actually understand what goes on with unsigned, what unsigned arithmetic is. Things like that made C complex. The language part of Java is, I think, pretty simple. The libraries you have to look up.

Nobilis
  • I didn't downvote any of the answers here, unless something is blatantly wrong, it is pointed out by more than one poster and the person answering the question has not corrected it I won't downvote it. If you don't believe me, then you can check my reputation for the day, if you downvote an answer you lose reputation. – Nobilis Aug 15 '13 at 07:54
  • 2
    The number of idiots that just downvote without giving an explanation comment is quite high on SO. – 6502 Aug 15 '13 at 08:25
  • @6502 I concur, I think if you downvote an answer you should lose more than just 1 rep, unless you've added a comment following your downvote where you've clearly indicated (by preceding it with '-1' or something) that you're the one downvoting and an explanation is provided (a lot of people do this which is great). If SO can trim greetings at the beginning of a question, I am sure it can handle this sort of parsing wizardry too :) – Nobilis Aug 15 '13 at 08:31
  • 1
    Downvoted with the following reason: signed is not *always* value-converted to unsigned, see my answer, even though it is in this particular case (because `size_t` is at least of the same rank as `int`). – TemplateRex Aug 15 '13 at 09:07
  • Yes, I should have clarified that they need to be of the same rank, otherwise the superior one will eclipse the other one (e.g. the presence of `long long signed` will convert an `unsigned` variable to signed). – Nobilis Aug 15 '13 at 10:27
7

This is an historical design bug of C that was also repeated in C++.

It dates back to 16-bit computers: the error was deciding to use all 16 bits to represent sizes up to 65535, giving up the possibility of representing negative sizes.

This in itself wouldn't have been an error if the meaning of unsigned were "non-negative integer" (a size cannot logically be negative), but it is a problem given the conversion rules of the language.

Given the conversion rules of the language the unsigned type in C doesn't represent a non-negative number, but it's instead more like a bitmask (the mathematical term is actually "a member of the ℤ/n ring"). To see why consider that for the C and C++ language

  • unsigned - unsigned gives an unsigned result
  • signed + unsigned gives an unsigned result

both of them clearly make no sense at all if you read unsigned as "non-negative number".
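The two bullet points can be checked with a short sketch. The function names are hypothetical; the expected values follow from the rule that unsigned arithmetic is performed modulo UINT_MAX + 1.

```c
#include <assert.h>
#include <limits.h>

/* unsigned - unsigned: the result is unsigned, so 3u - 5u cannot be -2. */
unsigned int unsigned_difference(unsigned int a, unsigned int b)
{
    return a - b;   /* computed modulo UINT_MAX + 1 */
}

/* signed + unsigned: the int operand is converted to unsigned first. */
unsigned int mixed_sum(int a, unsigned int b)
{
    return a + b;
}

/* Self-check for the wrap-around behaviour described in the text. */
void check_wraparound(void)
{
    assert(unsigned_difference(3u, 5u) == UINT_MAX - 1u);
    assert(mixed_sum(-2, 1u) == UINT_MAX);
    assert(mixed_sum(-1, 1u) == 0u);
}
```

Read as "non-negative numbers", none of these results is meaningful; read as arithmetic modulo 2^N, all of them are exactly what the language specifies.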

Of course, saying that the size of an object is a member of the ℤ/n ring doesn't make any sense at all, and this is where the error resides.

Practical implications:

Every time you deal with the size of an object, be careful, because the value is unsigned and that type in C/C++ has a lot of properties that are illogical for a number. Please always remember that unsigned doesn't mean "non-negative integer" but "member of the ℤ/n algebraic ring" and that, most dangerously, in a mixed operation an int is converted to unsigned int and not the opposite.

For example:

void drawPolyline(const std::vector<P2d>& pts) {
    for (int i=0; i<pts.size()-1; i++) {
        drawLine(pts[i], pts[i+1]);
    }
}

is buggy, because if passed an empty vector of points it will do illegal (UB) operations. The reason is that pts.size() returns an unsigned value.

The rules of the language will convert 1 (an integer) to 1{mod n}, will perform the subtraction in ℤ/n resulting in (size-1){mod n}, will convert i also to a {mod n} representation and will do the comparison in ℤ/n.

C/C++ actually defines a < operator in ℤ/n (rarely done in math) and you will end up accessing pts[0], pts[1] ... and so on until huge numbers even if the input vector was empty.
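The failing loop bound can be isolated in a few lines of C. This is a sketch with hypothetical names, in which a plain size_t stands in for the value returned by pts.size().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Upper bound used by the buggy loop `i < n - 1`, with n a size_t. */
size_t buggy_loop_bound(size_t n)
{
    return n - 1;   /* wraps around to SIZE_MAX when n == 0 */
}

/* Number of line segments in a polyline of n points, without wrapping. */
size_t segment_count(size_t n)
{
    return n > 0 ? n - 1 : 0;
}

/* Self-check: the two functions differ only for an empty input. */
void check_bounds(void)
{
    assert(buggy_loop_bound(0) == SIZE_MAX);
    assert(buggy_loop_bound(3) == 2);
    assert(segment_count(0) == 0);
    assert(segment_count(3) == 2);
}
```

For an empty input the buggy bound is the largest representable size, so the loop body runs and indexes far past the end of the container.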

A correct loop could be

void drawPolyline(const std::vector<P2d>& pts) {
    for (int i=1; i<pts.size(); i++) {
        drawLine(pts[i-1], pts[i]);
    }
}

but I normally prefer

void drawPolyline(const std::vector<P2d>& pts) {
    for (int i=0,n=pts.size(); i<n-1; i++) {
        drawLine(pts[i], pts[i+1]);
    }
}

in other words getting rid of unsigned as soon as possible, and just working with regular ints.

Never use unsigned to represent size of containers or counters because unsigned means "member of ℤ/n" and the size of a container is not one of those things. Unsigned types are useful, but NOT to represent size of objects.

The standard C/C++ library unfortunately made this wrong choice, and it's too late to fix it. You are not forced to make the same mistake, however.

In the words of Bjarne Stroustrup:

Using an unsigned instead of an int to gain one more bit to represent positive integers is almost never a good idea. Attempts to ensure that some values are positive by declaring variables unsigned will typically be defeated by the implicit conversion rules

6502
  • +1 Nice explanation, I wasn't aware of the historical precedent behind the `unsigned/signed` representation. – Nobilis Aug 15 '13 at 08:35
2

Well, I'm not going to repeat the strong words Paul R said, but when you compare unsigned values with signed integers you are going to experience some bad things.

Use if ((-1) < (int)SIZE)

instead of your if condition.

No Idea For Name
0

Convert the unsigned value returned by the sizeof operator to a signed type.

When you compare a signed number with an unsigned number of the same or greater rank, the compiler implicitly converts the signed one to unsigned.
In a 4-byte int, -1 is represented as 11111111 11111111 11111111 11111111; read as a 32-bit unsigned value, this bit pattern is 2^32-1 (and -1 converted to a 64-bit size_t is 2^64-1).
So basically you are comparing 2^32-1 > SIZE, which is true.
You have to override that by explicitly casting the unsigned value to signed. Since the sizeof operator returns a size_t, which may be as wide as unsigned long long, you can cast it to signed long long:

if((-1)<(signed long long)SIZE)

Use this if condition in your code.

Himanshu Pandey