54

I always use unsigned int for values that should never be negative. But today I noticed this situation in my code:

void CreateRequestHeader( unsigned bitsAvailable, unsigned mandatoryDataSize, 
    unsigned optionalDataSize )
{
    if ( bitsAvailable - mandatoryDataSize >= optionalDataSize ) {
        // Optional data fits, so add it to the header.
    }

    // BUG! The above includes the optional part even if
    // mandatoryDataSize > bitsAvailable.
}

Should I start using int instead of unsigned int for numbers, even if they can't be negative?

Steve Hanov
  • 11,316
  • 16
  • 62
  • 69
  • 14
    What's wrong with: if (bitsAvailable >= optionalDataSize + mandatoryDataSize) { ... } ? – Russell Borogove Jul 15 '10 at 19:55
  • 1
    FYI, Java does not support unsigned types, so if you ever plan to have your code interop with Java, you should avoid those types unless you actually need the range of the type for specific values. I don't believe it is appropriate to use unsigned solely for the purposes of indicating that negative values are not supported/allowed. – David Jul 15 '10 at 20:04
  • 6
    Another FYI: These sort of bugs are the kind that a good static code analyzer will find for you. Coverity will find problems just like this one, haven't used any others enough to say, but I'm sure most of them would catch that. Here's a list of available tools: http://en.wikipedia.org/wiki/List_of_tools_for_static_code_analysis – Mattias Nilsson Jul 15 '10 at 20:12
  • See also: http://stackoverflow.com/questions/1951519/when-to-use-stdsize-t for C++, but answers still mostly apply – BlueRaja - Danny Pflughoeft Jul 16 '10 at 16:59
  • 3
    @Russell, It's not perfect either. Addition may cause overflow and wrap unsigned. – Nyan Jul 17 '10 at 13:24

16 Answers

129

One thing that hasn't been mentioned is that mixing signed and unsigned numbers can lead to security bugs. This is a big issue, since many of the functions in the standard C library take or return unsigned numbers (fread, memcpy, malloc etc. all take size_t parameters).

For instance, take the following innocuous example (from real code):

//Copy a user-defined structure into a buffer and process it
char* processNext(char* data, short length)
{
    char buffer[512];
    if (length <= 512) {
        memcpy(buffer, data, length);
        process(buffer);
        return data + length;
    } else {
        return NULL;
    }
}

Looks harmless, right? The problem is that length is signed, but is converted to unsigned (size_t) when passed to memcpy. Thus setting length to SHRT_MIN will pass the <= 512 test, but cause memcpy to copy far more than 512 bytes into the buffer - this allows an attacker to overwrite the function's return address on the stack and (after a bit of work) take over your computer!

You may naively be saying, "It's so obvious that length needs to be size_t or checked to be >= 0, I could never make that mistake". Except, I guarantee that if you've ever written anything non-trivial, you have. So have the authors of Windows, Linux, BSD, Solaris, Firefox, OpenSSL, Safari, MS Paint, Internet Explorer, Google Picasa, Opera, Flash, Open Office, Subversion, Apache, Python, PHP, Pidgin, Gimp, ... on and on and on ... - and these are all bright people whose job is knowing security.

In short, always use size_t for sizes.
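
For illustration, here is a minimal sketch of how the function above might look with the size carried in a size_t (the process() prototype is assumed, as in the snippet above, and the name processNextSafe is made up for this example):

#include <stddef.h>
#include <string.h>

void process(char *buffer);   /* assumed to exist, as in the example above */

/* The length is a size_t, so it can never be negative, and the copy is
   refused outright if it would not fit into the local buffer. */
char *processNextSafe(char *data, size_t length)
{
    char buffer[512];
    if (length > sizeof buffer) {
        return NULL;          /* too long: reject instead of overflowing */
    }
    memcpy(buffer, data, length);
    process(buffer);
    return data + length;
}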

Man, programming is hard.

Glorfindel
  • 21,988
  • 13
  • 81
  • 109
BlueRaja - Danny Pflughoeft
  • 84,206
  • 33
  • 197
  • 283
  • 15
    No, **forgetting bounds checking** results in security bugs. If you got it wrong in the other direction, `unsigned` wouldn't help you, your function would happily write to `myArray[0xFFFFFFFF]`. – dan04 Jul 16 '10 at 01:24
  • 15
    @dan04: No, **the root cause is using signed ints when you should be using unsigned ints** like `size_t` *(or, more precisely, it's the implicit conversion between signed/unsigned numbers)*. Of course, forgetting to check bounds is also a problem. I've changed the example to make this more clear - thanks. – BlueRaja - Danny Pflughoeft Jul 16 '10 at 16:25
  • 1
    This problem has bitten me in the ass so many times, that now I'm just casting everything to signed integers (yes, signed). 2 gigabytes of address space is not worth the trouble :) – AareP Aug 20 '11 at 19:34
  • @AareP: There is no such thing as a signed pointer, only a pointer to a signed type, so you never lose any address space. But even then, your solution does not help in the case above, because the variable is implicitly cast to `unsigned` when you call `memcpy()`. – BlueRaja - Danny Pflughoeft Feb 09 '12 at 17:35
  • What type should I use when doing subtraction with a `size_t`? – szx Jun 06 '13 at 07:50
  • 1
    @szx: http://stackoverflow.com/questions/14202241/what-type-for-subtracting-2-size-ts – BlueRaja - Danny Pflughoeft Jun 06 '13 at 14:18
  • 9
    I still don't see why omitting bounds checking for the lower bound of `length` is not the root problem here. Sure, you could have used an unsigned type like `size_t`, but then you wouldn't even be able to check that the lower bound is non-negative. Thanks to the implicit conversion rules, that just causes *different* bugs. How is that an improvement? – Cody Gray - on strike Jul 21 '13 at 11:11
  • 2
    @dan04: No, it will not write. If `unsigned int length = 0xFFFFFFFF` is used, then `if (length <= 512)` will evaluate to `false`. – Adam Apr 17 '14 at 10:19
  • You don't always want to put security checks everywhere, because security = defensive programming, and being too defensive has lately been called out as a code smell. I don't take a side; I simply note that the two camps need to think about each other one day, and try to say something consistent for a change. – v.oddou Dec 12 '14 at 06:39
  • 1
    You should mention that in gcc the flag `-Wconversion` or more specifically `-Wsign-conversion` warns about the implicit sign conversion and the error becomes instantly visible. The problem is the hidden conversion. – Arne Jun 14 '17 at 11:41
  • @CodyGray What do you mean? If `size_t length` is non-negative as defined by the type-system, then surely there's no reason to check that it's non-negative? FWIW, `memcpy` accepts a `size_t`, so there's no implicit conversion. (And you can raise warnings for those, anyways.) – Mateen Ulhaq Aug 02 '19 at 06:16
  • @MateenUlhaq: In that code, `length` is a `short`, but gets implicitly converted to `size_t` when passed to `memcpy`. The negative number gets converted to a value `> 512`, which results in a buffer overflow. You can have the compiler raise a warning for exactly this reason, but that's a non-sequitur to the question. Besides, the mistake still slips by often, as seen in all of my examples. – BlueRaja - Danny Pflughoeft Aug 02 '19 at 16:45
30

Should I always ...

The answer to "Should I always ..." is almost certainly 'no'. There are a lot of factors that dictate which datatype you should use; consistency is important.

But this is a highly subjective question, and it's really easy to mess up with unsigned types:

for (unsigned int i = 10; i >= 0; i--);

results in an infinite loop.
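
If you do need to count down with an unsigned index, one common idiom (just a sketch, not the only way) is to do the decrement inside the condition, so the index never has to go below zero:

#include <stdio.h>

int main(void)
{
    /* The condition decrements i before each iteration, so the body
       sees 10, 9, ..., 0 and the loop stops without ever wrapping. */
    for (unsigned int i = 11; i-- > 0; ) {
        printf("%u\n", i);
    }
    return 0;
}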

This is why some style guides, including Google's C++ Style Guide, discourage unsigned data types.

In my personal opinion, I haven't run into many bugs caused by these problems with unsigned data types — I'd say use assertions to check your code and use them judiciously (and less when you're performing arithmetic).

Jonathan Leffler
  • 730,956
  • 141
  • 904
  • 1,278
Stephen
  • 47,994
  • 7
  • 61
  • 70
  • 1
    IMHO, `unsigned` helps catch errors during compilation phase rather than run-time. Ordinal values, such as quantities, should be `unsigned int` rather than `signed int`. – Thomas Matthews Jul 15 '10 at 20:25
  • 12
    Undetected underflow and overflow are basic C-family gotchas - using signed vs. unsigned changes the error cases, but doesn't get rid of any. Of course having an error case right adjacent to zero can be a *particularly* bad thing, but as you say, it depends what you're doing. In the loop above, you could check for `!= ~0` as your end condition - it's a useful unsigned invalid/end value. It's a slight cheat (`0` is int, so `~0` is `-1`) but on sane machines the implicit cast just works, and visually it's less weird than having an unsigned `-1`. –  Jul 15 '10 at 22:36
  • 3
    @Thomas : Thanks, for the feedback, but I'm not entirely sure I agree. c (and c++) provides implicit conversions between `signed` and `unsigned` types, which can yield silent and surprising results. There aren't too many syntactic constraints between the two that can trigger a compilation failure (unless you pass additional compiler warning flags). The benefit of an `unsigned` type is mostly semantic, unless you're specifically using the unsigned type to avoid manipulation of the sign bit (e.g. in a bitmask). – Stephen Jul 15 '10 at 22:51
  • @Steve314 : Yep, there are certainly ways to avoid this - but they aren't as intuitive to read as `>=0`... which is why it became a 'gotcha' :) – Stephen Jul 15 '10 at 22:55
  • `for (unsigned i = 10; i != -1; --i)` is perfectly fine. – Alexandre C. Jul 16 '10 at 16:26
  • 1
    Bad Things™ happen when you use signed numbers to represent size-parameters. See my post. – BlueRaja - Danny Pflughoeft Jul 16 '10 at 16:49
  • I don't think that description of Google's style guide is 100% accurate. Here's what it actually says: ["Of the C integer types, only `int` should be used. When appropriate, you are welcome to use standard types like `size_t` and `ptrdiff_t`."](https://google-styleguide.googlecode.com/svn/trunk/cppguide.html#Integer_Types) and then it goes on to elaborate some cases in which you should use `int` instead of one of the unsigned types (like `uint32_t`). Perhaps Google's style guide has changed since this answer was written? – D.W. Mar 30 '15 at 17:01
  • @D.W. [As of 21 March 2016](https://web.archive.org/web/20160321095337/https://google.github.io/styleguide/cppguide.html#Integer_Types), their suggestion seems pretty unequivocal to me: “You should not use the unsigned integer types such as `uint32_t`, unless there is a valid reason such as representing a bit pattern rather than a number, or you need defined overflow modulo 2^N. In particular, do not use unsigned types to say a number will never be negative. Instead, use assertions for this.” – Lawrence Velázquez Apr 30 '16 at 19:38
15

Some cases where you should use unsigned integer types are:

  • You need to treat a datum as a pure binary representation.
  • You need the semantics of modulo arithmetic you get with unsigned numbers.
  • You have to interface with code that uses unsigned types (e.g. standard library routines that accept/return size_t values).

But for general arithmetic, when you say that something "can't be negative," that does not necessarily mean you should use an unsigned type. You can still put a negative value into an unsigned variable; it just becomes a really large value when you read it back out. So if you mean that negative values are forbidden, such as for a basic square root function, then you are stating a precondition of the function, and you should assert it. And you can't assert a condition on values the type cannot even represent; you need a way to hold out-of-band values so you can test for them (this is the same sort of logic behind getchar() returning an int and not a char).
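
As a hedged sketch of that precondition idea (the name my_sqrt is made up for illustration): keep the parameter signed, so a bad value is still representable, and enforce the precondition with an assertion rather than with the type:

#include <assert.h>
#include <math.h>

double my_sqrt(int x)
{
    /* The precondition is checked explicitly; an unsigned parameter
       would silently turn -1 into a huge positive number instead. */
    assert(x >= 0 && "my_sqrt requires a non-negative argument");
    return sqrt((double)x);
}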

Additionally, the choice of signed-vs.-unsigned can have practical repercussions on performance, as well. Take a look at the (contrived) code below:

#include <stdbool.h>

bool foo_i(int a) {
    return (a + 69) > a;
}

bool foo_u(unsigned int a)
{
    return (a + 69u) > a;
}

Both functions are identical except for the type of their parameter. But, when compiled with c99 -fomit-frame-pointer -O2 -S, you get:

        .file   "try.c"
        .text
        .p2align 4,,15
.globl foo_i
        .type   foo_i, @function
foo_i:
        movl    $1, %eax
        ret
        .size   foo_i, .-foo_i
        .p2align 4,,15
.globl foo_u
        .type   foo_u, @function
foo_u:
        movl    4(%esp), %eax
        leal    69(%eax), %edx
        cmpl    %eax, %edx
        seta    %al
        ret
        .size   foo_u, .-foo_u
        .ident  "GCC: (Debian 4.4.4-7) 4.4.4"
        .section        .note.GNU-stack,"",@progbits

You can see that foo_i() is more efficient than foo_u(). This is because unsigned arithmetic overflow is defined by the standard to "wrap around," so (a + 69u) may very well be smaller than a if a is very large, and thus there must be code for this case. On the other hand, signed arithmetic overflow is undefined, so GCC will go ahead and assume signed arithmetic doesn't overflow, and so (a + 69) can't ever be less than a. Choosing unsigned types indiscriminately can therefore unnecessarily impact performance.

Nietzche-jou
  • 14,415
  • 4
  • 34
  • 45
12

The answer is Yes. The "unsigned" int type of C and C++ is not an "always positive integer", no matter what the name of the type suggests. The behavior of C/C++ unsigned ints makes no sense if you try to read the type as "non-negative"... for example:

  • The difference of two unsigned values is an unsigned number (which makes no sense if you read it as "the difference between two non-negative numbers is non-negative")
  • The addition of an int and an unsigned int is unsigned
  • There is an implicit conversion from int to unsigned int (if you read unsigned as "non-negative" it's the opposite conversion that would make sense)
  • If you declare a function accepting an unsigned parameter and someone passes a negative int, it simply gets implicitly converted to a huge positive value; in other words, using an unsigned parameter type doesn't help you find errors either at compile time or at runtime (see the sketch below).
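
A small sketch of the last two points, using nothing beyond standard C: both the comparison and the assignment compile without complaint, and both show the "huge positive value" behavior described above:

#include <stdio.h>

int main(void)
{
    unsigned int u = 1;
    int i = -1;

    /* The signed operand is converted to unsigned before the comparison,
       so -1 becomes UINT_MAX and the "obvious" result is reversed. */
    if (i < u) {
        puts("-1 < 1u   (what you might expect)");
    } else {
        puts("-1 >= 1u  (what actually happens)");
    }

    /* Storing (or passing) a negative int where an unsigned is expected
       also compiles silently and yields a huge positive value. */
    unsigned int converted = i;
    printf("-1 stored in an unsigned int: %u\n", converted);
    return 0;
}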

Indeed unsigned numbers are very useful for certain cases because they are elements of the ring "integers-modulo-N" with N being a power of two. Unsigned ints are useful when you want to use that modulo-n arithmetic, or as bitmasks; they are NOT useful as quantities.

Unfortunately, in C and C++ unsigned was also used to represent non-negative quantities, to be able to use all 16 bits when integers were that small... at that time being able to use 32k or 64k was considered a big difference. I'd classify it basically as a historical accident... you shouldn't try to read logic into it, because there was no logic.

By the way, in my opinion that was a mistake... if 32k is not enough then quite soon 64k won't be enough either; abusing the modulo type just to gain one extra bit was, in my opinion, too high a cost to pay. Of course it would have been reasonable if a proper non-negative type had been present or defined... but the unsigned semantics are just wrong for use as a non-negative type.

Sometimes you may find someone who says that unsigned is good because it "documents" that you only want non-negative values... however that documentation is of value only to people who don't actually know how unsigned works in C or C++. For me, seeing an unsigned type used for non-negative values simply means that whoever wrote the code didn't understand that part of the language.

If you really understand and want the "wrapping" behavior of unsigned ints then they're the right choice (for example, I almost always use "unsigned char" when I'm handling bytes); if you're not going to use the wrapping behavior (and that behavior is just going to be a problem for you, as in the case of the difference you showed), then this is a clear indicator that the unsigned type is a poor choice and you should stick with plain ints.

Does this mean that the return type of C++ std::vector<>::size() is a bad choice? Yes... it's a mistake. But if you say so, be prepared to be called bad names by those who don't understand that "unsigned" is just a name... what counts is the behavior, and that is "modulo-n" behavior (and no one would consider a "modulo-n" type a sensible choice for the size of a container).

6502
  • 112,025
  • 15
  • 165
  • 265
  • 5
    -1. Er, I mean +4294967295 :) `unsigned`'s semantics are illogical. – dan04 Jul 16 '10 at 01:34
  • 2
    @dan04: The problem with unsigned integers is that they get used for two different purposes, each of which could have a sensible set of rules, but C has a mish-mosh of rules from those two purposes. Numeric types which wrap around are very useful for some things. When processing TCP packets, for example, it's very useful to be able to say `tcp->stuffed - tcp->acked` and know how many bytes have been stuffed into the buffer but not acknowledged even if the sequence numbers have wrapped around. The problem is that unsigned values don't have consistent wrapping semantics... – supercat Apr 21 '15 at 16:42
  • ...because they're often used to hold values which will never be negative, but are too large to fit in the same-sized unsigned type. The wrapping behaviors of unsigned types weren't so much designed into them as they occurred naturally in early systems and were useful. – supercat Apr 21 '15 at 16:44
  • It was pretty common on many systems with a 16-bit `int` type to have individual objects which were bigger than 32K, but efficiently handling objects larger than 64K would have required a larger `int` type. The problem with `unsigned int` is that as you quite rightly note it is used to serve two disjoint roles (numbers versus algebraic rings). I wish C would add new separate types for natural numbers up to 2^2^n-1 [e.g. 65535], natural numbers up to 2^(2^n-1)-1 [e.g. 32767], and algebraic rings mod 2^2^n [e.g. 65536], with semantics that were better in each case. – supercat Apr 22 '15 at 15:49
11

Bjarne Stroustrup, creator of C++, warns about using unsigned types in his book The C++ Programming Language:

The unsigned integer types are ideal for uses that treat storage as a bit array. Using an unsigned instead of an int to gain one more bit to represent positive integers is almost never a good idea. Attempts to ensure that some values are positive by declaring variables unsigned will typically be defeated by the implicit conversion rules.

5ound
  • 1,179
  • 6
  • 9
  • Yet the standard library used unsigned types for container size (a major source of bugs in C++ programs)... – 6502 Jul 15 '10 at 21:26
  • @6502 I would interface with the standard containers using iterators for almost every task except the most trivial or throw-away snippets. – Khaled Alshaya Jul 16 '10 at 00:18
    To be more explicit: he does **not** warn about unsigned *in general*! He only warns about trying to extend the value range by using unsigned instead of signed! – Aconcagua Nov 03 '17 at 10:38
7

I seem to be in disagreement with most people here, but I find unsigned types quite useful, though not in their raw historic form.

If you consistently stick to the semantics that a type represents, then there should be no problem: use size_t (unsigned) for array indices, data offsets etc., off_t (signed) for file offsets, ptrdiff_t (signed) for differences of pointers, uint8_t for small unsigned integers and int8_t for signed ones. That way you avoid at least 80% of portability problems.

And don't use int, long, unsigned or char unless you must. They belong in the history books. (Sometimes you must: error returns, bit fields, etc.)

And to come back to your example:

bitsAvailable - mandatoryDataSize >= optionalDataSize

can be easily rewritten as

bitsAvailable >= optionalDataSize + mandatoryDataSize

which doesn't avoid the problem of a potential overflow (assert is your friend) but gets you a bit nearer to the idea of what you want to test, I think.
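
A minimal sketch of that test with the wrap-around case handled explicitly (the helper name optional_fits is made up; it mirrors the parameters from the question):

#include <stddef.h>

/* Returns 1 if the optional data fits, 0 otherwise. The subtraction is
   only performed once we know it cannot wrap below zero. */
int optional_fits(size_t bitsAvailable, size_t mandatoryDataSize,
                  size_t optionalDataSize)
{
    if (mandatoryDataSize > bitsAvailable) {
        return 0;   /* even the mandatory part does not fit */
    }
    return optionalDataSize <= bitsAvailable - mandatoryDataSize;
}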

Aconcagua
  • 24,880
  • 4
  • 34
  • 59
Jens Gustedt
  • 76,821
  • 6
  • 102
  • 177
  • I like this: It's a good idea to avoid subtraction if you are using unsigned types. – Steve Hanov Jul 16 '10 at 16:13
  • 1
    On a 32-bit system, given `uint16_t x = 0xFFFF; uint16_t y=x*x;` what does the standard say about the value of `y`? – supercat Apr 21 '15 at 16:46
  • @supercat, I don't see what you are heading for, but the rules are simple. The RHS is computed in 32bit `int`. The result of the multiplication seems to be `0xFFFE0001` so the multiplication overflows and the behavior is undefined. This is a good example why one should never use narrow types for arithmetic. When using `size_t` this problem wouldn't occur. – Jens Gustedt Apr 22 '15 at 07:10
  • @JensGustedt: Until recently, the majority of embedded systems used a 16-bit `int`. If one was writing code on such a system to update a 16-bit two's-complement or one's-complement checksum after writing a repeated value to a stream, multiplying two `uint16_t` would be the natural way to do it. Further, until very recently, 99.9% of C compilers for 32-bit systems would yield exactly the same computation without any difficulty. While some would argue that the expression would be better written as `1u*x*x`, I regard the need for the latter form as a deficiency in the language spec. – supercat Apr 22 '15 at 15:04
  • @JensGustedt: Further, I don't think `size_t` and `ptrdiff_t` really help much; given `char foo[100], *p1 = foo, *p2 = foo+100;` what is the value of `(p1-p2) > sizeof foo`? On some systems where the largest item size is between 32768 and 65535, or between 2147483648 and 4294967295, the expression would yield 0, but on many other systems it would yield 1. – supercat Apr 22 '15 at 15:18
  • Do you mean `p2 - p1` ? `p2 - p1` is only defined if both pointers point inside (or one beyond) the same object. So by definition the value fits into `size_t`. Now if you really meant `p1-p2`, the result type is `ptrdiff_t` so this is a negative value. If it underflows, again the behavior is not defined. --- I am still not clear what you want to prove or disprove. The only thing that you have shown so far is some miscomfort with what I say in my answer. – Jens Gustedt Apr 22 '15 at 15:32
  • @supercat, also observe that this answer here is almost 5 years old. I have written that stuff up, now also quite a while ago https://gustedt.wordpress.com/2013/07/15/a-praise-of-size_t-and-other-unsigned-types/ – Jens Gustedt Apr 22 '15 at 15:38
  • @JensGustedt: How should one cleanly handle operations which e.g. combine size_t and ptrdiff_t? IMHO, to really be a portable language C should add some new kinds of integral types to distinguish types that should behave as natural numbers (which would compare greater than negative numbers of any size, and would not be required to "wrap" cleanly in any case) from those that should behave as wrapping algebraic rings (which, when added to a signed or natural number of any size, would yield a member of the same ring). That would allow improved optimizations in many kinds of code... – supercat Apr 22 '15 at 16:00
  • ...and allow things like `wrap16_t x=65535; wrap16_t y=x*x;` to be written with clearly-defined semantics, unlike `int16_t` which would require the latter expression to be written as `1u*x*x`. A lot of code today expects `uint32_t` will behave as a `wrap32_t` and would fail on a machine where `int` is 64 bits, but if a `wrap32_t` type existed such code could be ported easily to 64-bit systems by changing some `uint32_t` to `wrap32_t` and some to `nat32_t`; in most cases it should be fairly obvious which would be required. – supercat Apr 22 '15 at 16:04
  • 1
    @JensGustedt: BTW, I just looked through your blog about C defects. I think the biggest thing I'd say C is missing is a standard means by which a program can say to the compiler "This is what I require of my implementation; you should either give me what I require or refuse compilation". Presently, many compilers offer command-line switches to control whether `char` is signed or unsigned, whether things like integer overflow will behave fully predictably, somewhat predictably, or negate the laws of time and causality, etc. but there's no standard way for programs to specify requirements. – supercat Apr 22 '15 at 16:48
  • @JensGustedt: Compilers for ones-complement machines wouldn't have to generate code which interprets (-2 & 7) as equal to 6, but if given code that says "I expect two's-complement math" would have to either generate bitwise-and code which behaves in such fashion or refuse compilation. Likewise if a compiler for a two's-complement machine is given code which expects ones'-complement math. As it is, it's often impossible to write practical non-trivial programs which don't require some assumptions beyond those given in the C standard, but there's no machine-readable way to document them. – supercat Apr 22 '15 at 16:58
  • @supercat, this discussion is not leading far. And BTW it is easy to test at compile time if it is two's complement or not. – Jens Gustedt Apr 22 '15 at 22:09
6
if (bitsAvailable >= optionalDataSize + mandatoryDataSize) {
    // Optional data fits, so add it to the header.
}

Bug-free, so long as mandatoryDataSize + optionalDataSize can't overflow the unsigned integer type -- the naming of these variables leads me to believe this is likely to be the case.

Stephen Canon
  • 103,815
  • 19
  • 183
  • 269
6

You can't fully avoid unsigned types in portable code, because many typedefs in the standard library are unsigned (most notably size_t), and many functions return those (e.g. std::vector<>::size()).

That said, I generally prefer to stick to signed types wherever possible for the reasons you've outlined. It's not just the case you bring up - in any mixed signed/unsigned arithmetic, the signed argument is quietly converted to unsigned.

Pavel Minaev
  • 99,783
  • 25
  • 219
  • 289
3

From the comments on one of Eric Lippert's blog posts (see here):

Jeffrey L. Whitledge

I once developed a system in which negative values made no sense as a parameter, so rather than validating that the parameter values were non-negative, I thought it would be a great idea to just use uint instead. I quickly discovered that whenever I used those values for anything (like calling BCL methods), they had to be converted to signed integers. This meant that I had to validate that the values didn't exceed the signed integer range on the top end, so I gained nothing. Also, every time the code was called, the ints that were being used (often received from BCL functions) had to be converted to uints. It didn't take long before I changed all those uints back to ints and took all that unnecessary casting out. I still have to validate that the numbers are not negative, but the code is much cleaner!

Eric Lippert

Couldn't have said it better myself. You almost never need the range of a uint, and they are not CLS-compliant. The standard way to represent a small integer is with "int", even if there are values in there that are out of range. A good rule of thumb: only use "uint" for situations where you are interoperating with unmanaged code that expects uints, or where the integer in question is clearly used as a set of bits, not a number. Always try to avoid it in public interfaces.

  • Eric
Glorfindel
  • 21,988
  • 13
  • 81
  • 109
Brian
  • 25,523
  • 18
  • 82
  • 173
  • That's in regards to C#, not C – BlueRaja - Danny Pflughoeft Jul 15 '10 at 23:40
  • @BlueRaja: The specific examples are C#-specific, but the general points the comments make are still quite true. – Brian Jul 16 '10 at 01:41
  • As I mention in my post, you **should** be using unsigned data types for APIs which require a size-parameter (use `size_t`). This is not the case in .Net, where buffer overflows are a non-issue. – BlueRaja - Danny Pflughoeft Jul 16 '10 at 16:48
  • @BlueRaja: The quote explicitly states that you should use unsigned data types when calling code that expects unsigned int. – Brian Jul 16 '10 at 17:35
  • 1
    I meant you should be using unsigned data types for your own APIs which require a size-parameter (in C), regardless of what you're calling. – BlueRaja - Danny Pflughoeft Jul 16 '10 at 17:57
  • @BlueRaja: I fail to see how that is not something which would be naturally concluded from the quote. Although you should try to avoid it in public interfaces, if the code your public interface is interacting with is expecting an unsigned data type, then you will need to use an unsigned value to interoperate with the code you are calling through that interface. – Brian Jul 16 '10 at 18:04
2

The situation where (bitsAvailable - mandatoryDataSize) produces an 'unexpected' result when the types are unsigned and bitsAvailable < mandatoryDataSize is a reason that signed types are sometimes used even when the data is expected to never be negative.

I think there's no hard and fast rule - I typically 'default' to using unsigned types for data that has no reason to be negative, but then you have to take care to ensure that arithmetic wrapping doesn't expose bugs.

Then again, if you use signed types, you still have to sometimes consider overflow:

INT_MAX + 1

The key is that you have to take care to watch for these kinds of bugs whenever you perform arithmetic.
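
For instance, a hedged sketch of guarding a signed addition (the helper name can_add is made up): with signed int the check has to happen before the addition, because signed overflow is undefined behavior:

#include <limits.h>

/* Returns 1 if a + b can be computed without overflowing an int. */
int can_add(int a, int b)
{
    if (b > 0) {
        return a <= INT_MAX - b;   /* would a + b exceed INT_MAX?     */
    }
    return a >= INT_MIN - b;       /* would a + b drop below INT_MIN? */
}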

Michael Burr
  • 333,147
  • 50
  • 533
  • 760
  • The "wrapping" is the only interesting feature of unsigned ints (for regular ints you only have undefined behavior). If the wrapping is going to be a problem (or if you've to be careful to avoid it) then it's a clear sign that "unsigned" was the wrong choice. Using unsigned and having problems with the wrapping (that's the most distinctive feature of unsigned types) is nonsense... when you use unsigned you WANT the wrapping... you should choose unsigned BECAUSE of the wrapping behaviour... – 6502 Jul 15 '10 at 21:24
  • @6502: you make a really good point, and I honestly think that I sometimes use unsigned types when signed types might be a better choice. But I think there are also exceptions; for example, when dealing with file sizes you may need to be able to deal with the full range of `size_t` (or even some larger unsigned type), but you might still need to handle wrapping errors. – Michael Burr Jul 15 '10 at 21:55
2

No, you should use the type that is right for your application. There is no golden rule. Sometimes, on small microcontrollers, it is faster and more memory-efficient to use, say, 8- or 16-bit variables wherever possible, as that is often the native datapath size, but that is a very special case. I also recommend using stdint.h wherever possible. If you are using Visual Studio, you can find BSD-licensed versions of it.
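
As a small illustration of the stdint.h suggestion (the struct and its field names are made up): fixed-width types make the storage cost of each field explicit, which is what matters on a small target:

#include <stdint.h>

typedef struct {
    uint8_t  flags;         /* 0..255 is plenty here        */
    uint16_t sample_count;  /* 0..65535                     */
    int32_t  temperature;   /* may legitimately be negative */
} SensorRecord;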

user393170
  • 31
  • 1
1

If there is a possibility of overflow, then assign the values to the next larger data type during the calculation, e.g.:

void CreateRequestHeader( unsigned int bitsAvailable, unsigned int mandatoryDataSize, unsigned int optionalDataSize ) 
{ 
    signed __int64 available = bitsAvailable;
    signed __int64 mandatory = mandatoryDataSize;
    signed __int64 optional = optionalDataSize;

    if ( (mandatory + optional) <= available ) { 
        // Optional data fits, so add it to the header. 
    } 
} 

Otherwise, just check the values individually instead of calculating:

void CreateRequestHeader( unsigned int bitsAvailable, unsigned int mandatoryDataSize, unsigned int optionalDataSize ) 
{ 
    if ( bitsAvailable < mandatoryDataSize ) { 
        return;
    } 
    bitsAvailable -= mandatoryDataSize;

    if ( bitsAvailable < optionalDataSize ) { 
        return;
    } 
    bitsAvailable -= optionalDataSize;

    // Optional data fits, so add it to the header. 
} 
Remy Lebeau
  • 555,201
  • 31
  • 458
  • 770
0

You'll need to look at the results of the operations you perform on the variables to check whether they can overflow or underflow - in your case, the result can be negative. If so, you are better off using the signed equivalents.

Timo Geusch
  • 24,095
  • 5
  • 52
  • 70
0

I don't know if it's possible in C, but in this case I would just cast the X-Y expression to an int.

InsertNickHere
  • 3,616
  • 3
  • 26
  • 23
0

If your numbers should never be less than zero but have a chance of ending up below zero anyway, by all means use signed integers and sprinkle assertions or other runtime checks around. If you're actually working with 32-bit (or 64, or 16, depending on your target architecture) values where the most significant bit means something other than "-", you should only use unsigned variables to hold them. It's easier to detect integer overflows when a number that should always be positive turns very negative than when it merely hits zero, so if you don't need that bit, go with the signed ones.

nmichaels
  • 49,466
  • 12
  • 107
  • 135
0

Suppose you need to count from 1 to 50000. You can do that with a two-byte unsigned integer, but not with a two-byte signed integer (if space matters that much).
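
A tiny sketch of that point using the fixed-width names from stdint.h:

#include <stdint.h>
#include <stdio.h>

int main(void)
{
    uint16_t count = 50000;    /* fits: uint16_t ranges 0..65535            */
    /* int16_t bad = 50000; */ /* would not fit: int16_t tops out at 32767  */
    printf("%u\n", (unsigned)count);
    return 0;
}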

John
  • 15,990
  • 10
  • 70
  • 110