38

Is there any paragraph in the C++ standard that says using -1 for this is a portable and correct way, or is the only correct way to use the predefined values?

I had a conversation with my colleague about which is better: using -1 for the maximum value of an unsigned integer type, or using a value from limits.h or std::numeric_limits?

I told my colleague that using the predefined maximum values from limits.h or std::numeric_limits is the portable and clean way of doing this. However, the colleague objected that -1 is just as portable as the numeric limits and, moreover, has one more advantage:

unsigned short i = -1; // unsigned short max

can easily be changed to any other type, like

unsigned long i = -1; // unsigned long max

whereas using a predefined value from the limits.h header file or std::numeric_limits requires rewriting the value as well as the type on the left.

VP.
  • 41
    Seeing `-` and `unsigned` on the same line is guaranteed to raise a few eyebrows. – Ron Dec 07 '17 at 15:32
  • 6
    You don't need to repeat yourself if you use `auto`. – Quentin Dec 07 '17 at 15:32
  • @Ron I agree, but that is the hack my colleague says about, like `-1` will be converted to `0 - 1` which will use overflow and set the maximum number of a type. – VP. Dec 07 '17 at 15:33
  • 8
    I voted to reopen because this question is not an exact duplicate of the purported original. The other question discusses the behavior of arithmetic when values exceed the range of an unsigned integer type. While this question involves that, it asks a different question about the semantics of using `-1`. – Eric Postpischil Dec 07 '17 at 15:37
  • I would guess that the real cause of your problem is not in the question. But I bet the real problem was "what is the best way to report an error when a function returns an integer, without exceptions". In that case my answer would be: use a boolean to indicate the error, or the new C++17 `std::optional`. – Stargateur Dec 07 '17 at 15:39
  • @EricPostpischil It's the OP who close as duplicate and the question is "Is `-1` correct for using as maximum value of an unsigned integer?", in SO two question in one is a bad practice and make the question too broad. – Stargateur Dec 07 '17 at 15:41
  • This might be a better duplicate, although I'm hesitant to hammer it: https://stackoverflow.com/q/2273913/10077 – Fred Larson Dec 07 '17 at 15:41
  • 12
    Ron's comment is more than just a comment. While `-1` might be *technically* correct (see Eric's answer), from a clean code standpoint it isn't. Figuring out whether `-1` is an error here took you a question, and Eric a looking-up in the standard. `unsigned short i = USHRT_MAX` would require neither, and be more explicit about the statement's intended purpose. – DevSolar Dec 07 '17 at 15:42
  • @Stargateur: You can vote to close as too broad if you wish, or suggest another duplicate. This does not change the fact that the question, as stated (regardless of the intent of the OP), was not a duplicate of the purported original. – Eric Postpischil Dec 07 '17 at 15:44
  • 5
    The question may apply to multiple languages, yet the answer is not necessarily the same for C and C++. Selecting 1 language would reduce the unnecessary broadness of this question. – chux - Reinstate Monica Dec 07 '17 at 15:44
  • @chux: -1 will not survive a type change to `auto`. ;-) – DevSolar Dec 07 '17 at 15:50
  • 1
    What about One's Complement systems? – Christian Gibbons Dec 07 '17 at 15:51
  • @ChristianGibbons `some_unsigned_type x = -1;` will initialize `x` with the maximum value of the type regardless of the `int` encoding (two's complement, ones' complement, sign-magnitude) – chux - Reinstate Monica Dec 07 '17 at 16:00
  • 1
    "`unsigned short i = -1;` can easily be changed to any other type" to get the _maximum_ type should be narrowed to: "can easily be changed to any other *unsigned* type" – chux - Reinstate Monica Dec 07 '17 at 16:04
  • @chux so it's not that it takes the binary encoding of what `-1` would have been had it been a signed type? Interesting. – Christian Gibbons Dec 07 '17 at 16:05
  • @ChristianGibbons Yes. Conversion between numeric types (`int,unsigned,double,bool`, etc.) is primarily (maybe even solely) based on _value_, not _encoding_. – chux - Reinstate Monica Dec 07 '17 at 16:07
  • 2
    For the record, `auto i = std::numeric_limits::max();` states the type only exactly once. So the ease-of-editing concern doesn't apply to C++, I think. Comment not answer since that concern looks like preamble, not question. – Tommy Dec 07 '17 at 16:19
  • 2
    Is there anything non-portable about using `~0` instead? If so, is one conventionally preferred over the other? – Christian Gibbons Dec 07 '17 at 16:21
  • "what is better: using -1 for a maximum unsigned integer number or using a value from limits.h or std::numeric_limits" --> The consideration for what is better (best) should have been open to other solutions too. There are 3 types of people: those who always think in binary and those that do not. – chux - Reinstate Monica Dec 07 '17 at 16:59
  • 1
    Hmm, just of curiosity, I searched the source tree for Linux 4.15-rc1 for this. I saw 112 assignments of `-1` to unsigned something, 69 assignments of `U.*_MAX` to unsigned something, and 92 assignments of `~0` to an unsigned something. – ilkkachu Dec 07 '17 at 18:08
  • 1
    @ChristianGibbons The issue with `~0` is that it's not guaranteed to be equal to `-1` in all representations. – D Krueger Dec 07 '17 at 18:33
  • @DKrueger It is not -1 that we ultimately want. What we are after is the maximum value of an unsigned type. – Christian Gibbons Dec 07 '17 at 18:47
  • 3
    @ChristianGibbons Assigning `-1` to an unsigned variable results in the variable containing the maximum value because, due to the modulo arithmetic, you are effectively assigning `(UINT_MAX + 1) + (-1)`. Now consider a sign-and-magnitude representation where `~0` would be equal to `INT_MIN`. When assigned to an unsigned int, the value would effectively be `(UINT_MAX + 1) + INT_MIN`, which is not equal to `UINT_MAX`. – D Krueger Dec 07 '17 at 18:59
  • @DKrueger Oh, I see. Since it wasn't specified otherwise, the literal is being treated as a signed number rather than unsigned. So then to solve that issue, one would do `~0U` to avoid dealing with conversion from signed to unsigned? – Christian Gibbons Dec 07 '17 at 19:06
  • 3
    @ChristianGibbons `~0U` will work as long as the variable's type is not wider than an unsigned integer. – D Krueger Dec 07 '17 at 19:27
  • 1
    @ChristianGibbons I strongly recommend that if you're not going to use the constants defined in `limits.h` or C++'s `std::numeric_limits`, you just use `-1` instead of `~0U`. To be clear, both are unintuitive and require low-level knowledge to understand and careful reasoning to prove correct. However, the conversion of a negative value into an unsigned type is trivially defined in every C standard in terms of *values*, whereas the bitwise operators (other than the shifts) are defined (both conceptually and in the standard) strictly in terms of the actual binary representation of the data. – mtraceur Dec 07 '17 at 21:23
  • 2
    @ChristianGibbons Also, I think the mental gymnastics needed to really correctly prove `~0U` is correct when you see it in the code are more involved than those required to prove that `-1` is correct. And I am concerned that for all but the few people who grok and internalize the C standards enough to do those mental gymnastics correctly, `~0U` is slightly more likely to mislead and produce misunderstanding about the details of *why* that construct works than `-1` is (and those subtle misunderstandings carry over to other subtly incorrect code). – mtraceur Dec 07 '17 at 21:32
  • The header file `limits.h` contains the definition of `UINT_MAX`, which is what you're looking for. – user3629249 Dec 08 '17 at 05:31
  • I think it's worth pointing out that in C, you can't always know the type, thus you don't have a macro. For example with size_t in C89 where there is no SIZE_MAX. – pipe Dec 08 '17 at 09:28
  • Duplicate of [this question](https://stackoverflow.com/questions/2760502/question-about-c-behaviour-for-unsigned-integer-underflow), but also see [this one](https://stackoverflow.com/questions/8208023/converting-1-to-unsigned-types) and [this one](https://stackoverflow.com/questions/1863153/why-unsigned-int-0xffffffff-is-equal-to-int-1). – Nikos C. Dec 08 '17 at 09:45
  • @pipe: C89 is not standard C. The C tag implies standard C, which has provided these macros for 18 years. And in C the type of an object is always known for any expression, so one can very well use it. Don't write code for ancient versions if you don't need to! – too honest for this site Dec 11 '17 at 13:22
  • @Olaf C does not have a way to interrogate the underlying type for types from `typedef`. And as an embedded developer, yes. I often need to. – pipe Dec 11 '17 at 13:46
  • @pipe: The type of an object is clearly determined by the expression. And if you need the type of the LHS of an assignment, just use the type of that object. In the >20 years of embedded development using C (and the time before that on other systems), I didn't have a problem with that. If you had, you might want to question your approach. Said that: there is a macro for every standard type and the aliases. (On a sidenote: C does not allow to define new scalar types and `typedef` never defines a new type - that's one difference to C++). – too honest for this site Dec 11 '17 at 14:00
  • @VictorPolevoy - please edit this question and make it **C++**-only, remove the **C** tag. Otherwise the question will be deleted as a dupe. – rustyx Dec 13 '17 at 16:07
  • @RustyX okay, but about the answers? – VP. Dec 13 '17 at 16:39
  • @RustyX done. I have voted to reopen. – VP. Dec 13 '17 at 19:17
  • The standard itself uses -1 as the value of the unsigned [`std::basic_string::npos`](https://en.cppreference.com/w/cpp/string/basic_string/npos). – interjay Dec 12 '18 at 15:23
  • @interjay Cool thing to know, thanks. By saying "standard", did you mean the gcc libstdc++ implementation? – VP. Dec 13 '18 at 10:52
  • I meant the C++ standard itself, see the definition of `npos` in [24.3.2p5](https://timsong-cpp.github.io/cppwp/n4659/basic.string#5.2). – interjay Dec 13 '18 at 11:25

5 Answers

29

Regarding conversions of integers, C 2011 [draft N1570] 6.3.1.3, paragraph 2, says:

Otherwise, if the new type is unsigned, the value is converted by repeatedly adding or subtracting one more than the maximum value that can be represented in the new type until the value is in the range of the new type.

Thus, converting -1 to an unsigned integer type necessarily produces the maximum value of that type.
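As a quick compile-time check of that rule, here is a minimal C++ sketch (C++ has an equivalent value-based conversion rule). The helper name `converts_to_max` is made up for illustration and is not a library function.

```cpp
#include <limits>

// Hypothetical helper (not part of any library): true when converting the
// int value -1 to the unsigned integer type T yields T's maximum value.
template <typename T>
constexpr bool converts_to_max() {
    static_assert(std::numeric_limits<T>::is_integer &&
                      !std::numeric_limits<T>::is_signed,
                  "T must be an unsigned integer type");
    return static_cast<T>(-1) == std::numeric_limits<T>::max();
}

// The conversion is defined on values, so this holds for every width.
static_assert(converts_to_max<unsigned char>(), "unsigned char");
static_assert(converts_to_max<unsigned short>(), "unsigned short");
static_assert(converts_to_max<unsigned int>(), "unsigned int");
static_assert(converts_to_max<unsigned long>(), "unsigned long");
static_assert(converts_to_max<unsigned long long>(), "unsigned long long");
```

If any of these assertions failed, the translation unit would not compile, so simply building the file is the check.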

There may be issues with using -1 in various contexts where it is not immediately converted to the desired type. If it is immediately converted to the desired unsigned integer type, as by assignment or explicit conversion, then the result is clear. However, if it is a part of an expression, its type is int, and it behaves like an int until converted. In contrast, UINT_MAX has the type unsigned int, so it behaves like an unsigned int.

As chux points out in a comment, USHRT_MAX effectively has a type of int, so even the named limits are not fully safe from type issues.
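The type issues described above can be made visible at compile time. The following is only a sketch: the variable names `half_wrong` and `half_right` are invented for illustration, and the `USHRT_MAX` line assumes a conforming `<climits>`.

```cpp
#include <climits>
#include <type_traits>

// The literal -1 is an int until something converts it.
static_assert(std::is_same<decltype(-1), int>::value, "-1 has type int");

// UINT_MAX already has type unsigned int.
static_assert(std::is_same<decltype(UINT_MAX), unsigned int>::value,
              "UINT_MAX has type unsigned int");

// USHRT_MAX has the *promoted* type of unsigned short, which is int on
// platforms where int can represent every unsigned short value.
static_assert(std::is_same<decltype(USHRT_MAX),
                           decltype(+static_cast<unsigned short>(0))>::value,
              "USHRT_MAX has the promoted type of unsigned short");

// Pitfall: here -1 is still an int when the division happens, so the
// result is 0, not half of the maximum.
constexpr unsigned long long half_wrong = -1 / 2;

// Converting first and dividing afterwards gives the intended value.
constexpr unsigned long long half_right =
    static_cast<unsigned long long>(-1) / 2;

static_assert(half_wrong == 0, "int arithmetic happened before conversion");
static_assert(half_right == ULLONG_MAX / 2, "conversion happened first");
```

The point of the last two constants is that `-1` is only a safe spelling of "maximum" when it is converted to the target type before any other operation touches it.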

Micha Wiedenmann
Eric Postpischil
  • Concerning "`UINT_MAX` has the type `unsigned int` ...": this does not address OP's case of `unsigned short`. That suggests `USHRT_MAX`, whose type, AFAIK, is not specified. (C: Perhaps could be `int`, `unsigned`, `unsigned short`, ...). – chux - Reinstate Monica Dec 07 '17 at 16:45
  • 2
    @chux: That is a good point. The type is specified; these macros are “expressions that have the same type as would an expression that is an object of the corresponding type converted according to the integer promotions,” but that means `USHRT_MAX` may be an `int`, so it may behave unexpectedly if you are expecting an unsigned type. – Eric Postpischil Dec 07 '17 at 16:59
  • It might be beneficial to reference other standards to prove this, but for the record, as far as I can recall, this behavior was well-defined for *all* C standards as of this writing. If I remember right, C99 has the same text. C89's standard text does not include this exact phrase but it is implied by the rules for integer promotions/conversions, as I understand it. And I think all C++ standards have been aligned with this behavior. – mtraceur Dec 07 '17 at 21:34
  • Thank you for your answer. However, there was a long discussion about the question, it seems it is better to change it to c++ only. I have upvoted your answer though and it helped me. – VP. Dec 13 '17 at 19:18
18

Not using the standard way, or not clearly showing the intent, is often a bad idea that we pay for later.

I would suggest:

auto i = std::numeric_limits<unsigned int>::max(); 

or, as @jamesdlin suggested, a certainly better one that is closer to C habits:

unsigned int i = std::numeric_limits<decltype(i)>::max(); 

Your colleague's argument does not hold. Changing unsigned int to unsigned long int, as below:

auto i = std::numeric_limits<unsigned long int>::max(); 
  • does not require extra work compared to the -1 solution (thanks to the use of auto).
  • the -1 solution does not directly reflect our intent, hence it can have harmful consequences. Consider this code snippet:

using index_t = unsigned int;

... now in another file (or far away from the previous line) ...

const index_t max_index = -1;

First, it is not obvious why max_index is -1. Worse, if someone later decides to improve the code and defines

using index_t = ptrdiff_t;

then the statement max_index = -1 no longer yields the maximum and you get buggy code. Again, this cannot happen with something like:

const index_t max_index = std::numeric_limits<index_t>::max();
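To illustrate the difference, here is a small sketch that evaluates both spellings for the "before" and "after" definitions of the index type. The names `index_before`, `index_after`, `max_via_minus_one` and `max_via_limits` are hypothetical and exist only for this example.

```cpp
#include <cstddef>
#include <limits>

// Hypothetical stand-ins for index_t before and after the refactoring.
using index_before = unsigned int;
using index_after  = std::ptrdiff_t;

// Yields the same value as `Index x = -1;` would.
template <typename Index>
constexpr Index max_via_minus_one() {
    return static_cast<Index>(-1);
}

// Yields the same value as `Index x = std::numeric_limits<Index>::max();`.
template <typename Index>
constexpr Index max_via_limits() {
    return std::numeric_limits<Index>::max();
}

// With an unsigned index type, both spellings agree.
static_assert(max_via_minus_one<index_before>() ==
                  max_via_limits<index_before>(),
              "-1 is the maximum for unsigned types");

// After switching to a signed type, -1 is simply -1 ...
static_assert(max_via_minus_one<index_after>() == -1,
              "-1 stays -1 for signed types");

// ... while numeric_limits still reports the real maximum.
static_assert(max_via_limits<index_after>() > 0,
              "numeric_limits follows the type");
```

The `numeric_limits` spelling keeps tracking the alias, whereas the `-1` spelling silently changes meaning once the alias becomes signed.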

CAVEAT: there is nevertheless a caveat when using std::numeric_limits. It has nothing to do with integers; it concerns floating-point numbers.

std::cout << "\ndouble lowest: "
          << std::numeric_limits<double>::lowest()
          << "\ndouble min   : "
          << std::numeric_limits<double>::min() << '\n';

prints:

double lowest: -1.79769e+308    
double min   :  2.22507e-308  <-- maybe you expected -1.79769e+308 here!
  • min returns the smallest positive normalized value of the given type (for floating-point types)
  • lowest returns the lowest finite value of the given type, i.e. the most negative one

This is always worth remembering, as using min instead of lowest can be a source of bugs if we do not pay attention.
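The relationship between these members can be pinned down with a few compile-time checks. This is a sketch only; the alias `dbl` is introduced just to shorten the lines.

```cpp
#include <limits>

using dbl = std::numeric_limits<double>;  // shorthand for this example only

// For floating-point types, min() is a small *positive* number.
static_assert(dbl::min() > 0.0, "min() is positive for double");

// lowest() is the most negative finite value, i.e. -max().
static_assert(dbl::lowest() == -dbl::max(), "lowest() == -max()");
static_assert(dbl::lowest() < 0.0, "lowest() is negative");

// denorm_min() is the smallest positive value overall; it equals min()
// only on platforms without subnormal numbers.
static_assert(dbl::denorm_min() > 0.0, "denorm_min() is positive");
static_assert(dbl::denorm_min() <= dbl::min(), "denorm_min() <= min()");

// For integer types the two notions coincide.
static_assert(std::numeric_limits<int>::min() ==
                  std::numeric_limits<int>::lowest(),
              "min() == lowest() for int");
```

When the intent is "the most negative value", `lowest()` is the member that means the same thing for both integer and floating-point types.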

Picaud Vincent
  • 1
    `2.22507e-308` does not look like the "smallest finite value of the given type". I'd expect `4.940656e-324`. Perhaps it is the smallest [normal](https://en.wikipedia.org/wiki/Normal_number_(computing)) value of the given type? Ref: [min returns the minimum positive normalized value](http://en.cppreference.com/w/cpp/types/numeric_limits/min) – chux - Reinstate Monica Dec 07 '17 at 18:07
  • @chux I have cut/copy [cppreference](http://en.cppreference.com/w/cpp/types/numeric_limits). But I do agree, it is the smallest normalized number here. – Picaud Vincent Dec 07 '17 at 18:12
  • 2
    Or alternatively: `unsigned int i = std::numeric_limits<decltype(i)>::max();` – jamesdlin Dec 08 '17 at 04:04
  • @jamesdlin your syntax is certainly better, as it is closer to the usual C one -> I will mention your suggestion in my post – Picaud Vincent Dec 08 '17 at 07:57
  • 1
    @underscore_d I have completely rewritten this part. Thanks for pointing out this unclear wording. – Picaud Vincent Dec 08 '17 at 10:25
16

Is -1 correct for using as maximum value of an unsigned integer?

Yes, it is functionally correct when used as a direct assignment/initialization. Yet it often looks questionable, as @Ron notes.

Constants from limits.h or std::numeric_limits convey the intent more clearly, yet need maintenance should the type of i change.


[Note] The OP later dropped the C tag.

To add an alternative to assigning a maximum value (available in C11) that helps reduce code maintenance:

Use the loved/hated _Generic

#include <float.h>   /* LDBL_MAX, DBL_MAX, FLT_MAX */
#include <limits.h>  /* integer *_MAX macros */

#define info_max(X) _Generic((X), \
  long double: LDBL_MAX, \
  double: DBL_MAX, \
  float: FLT_MAX, \
  unsigned long long: ULLONG_MAX, \
  long long: LLONG_MAX, \
  unsigned long: ULONG_MAX, \
  long: LONG_MAX, \
  unsigned: UINT_MAX, \
  int: INT_MAX, \
  unsigned short: USHRT_MAX, \
  short: SHRT_MAX, \
  unsigned char: UCHAR_MAX, \
  signed char: SCHAR_MAX, \
  char: CHAR_MAX, \
  _Bool: 1, \
  default: 1/0 \
  )

int main() {
  ...
  some_basic_type i = info_max(i);
  ...
}

The above macro info_max() has limitations concerning types like size_t, intmax_t, etc., which may not be enumerated in the above list. There are more complex macros that can cope with that. The idea here is illustrative.

chux - Reinstate Monica
  • Good point about the constants (other than -1) breaking when the unsigned int size changes unless using a macro like yours. – Dave S Dec 07 '17 at 18:00
  • The `-1` also needs maintenance if the type changes to a signed one. – jpmc26 Dec 07 '17 at 21:48
  • @jpmc26 True about `-1` and a change to some _signed integer_ type - which is covered with `info_max(i)`. Yet OP's question is primarily about various _unsigned integer_ types and that's where `-1` assignment reduces maintenance. IMO, any use of `some_unsigned_type x = -1;`, at least, obligates an explaining comment. I also see it as a minor problem as a compiler settings may warn about the sign-ess change and I like warning free code. – chux - Reinstate Monica Dec 07 '17 at 22:16
  • @chux Right, not really arguing that this definitively makes one option better than another. Just saying that it's questionable whether -1 needs *less* maintenance if you're talking about type changes, which undermines the perceived advantage it might provide. The macro does seem to eliminate the maintenance concern, at the cost of having a custom macro that someone may then need to go read. Trade-offs everywhere. – jpmc26 Dec 08 '17 at 01:25
  • I'm curious: what's the deal with `default: 1/0`? – aschepler Dec 08 '17 at 04:52
  • @aschepler Unless I'm mistaken, it causes an error if the type isn't in the list. – jpmc26 Dec 08 '17 at 07:27
  • 1
    Does it? I'm not up on `_Generic`, but by my current understanding, dividing by zero just invokes undefined behaviour, which is not a way to trap errors, but rather to heap additional ones onto the problem. – underscore_d Dec 08 '17 at 10:58
  • @aschepler The `default: 1/0` in the _illustrative_ macro will typically cause a compile-time error when a type does not match any of the listed types. It can instead be removed to reliably cause a compile-time error in that case. Another way to use `default` is to cascade to another `_Generic` that has additional types that may differ from the above list. – chux - Reinstate Monica Dec 08 '17 at 13:34
  • @chux gcc and clang give me a warning, not an error, for `1/0`. – aschepler Dec 08 '17 at 22:56
  • @aschepler True, depending on flags, it might be a warning. – chux - Reinstate Monica Dec 08 '17 at 22:58
9

The technical side has been covered by other answers; and while you focus on technical correctness in your question, pointing out the cleanness aspect again is important, because imo that’s the much more important point.

The major reason why it is a bad idea to use that particular trickery is: The code is ambiguous. It is unclear whether someone used the unsigned trickery intentionally or made a mistake and actually wanted to initialize a signed variable to -1. Should your colleague mention a comment after you present this argument, tell him to stop being silly. :)

I’m actually slightly baffled that someone would even consider this trick in earnest. There’s an unambiguous, intuitive and idiomatic way to set a value to its max in C: the _MAX macros. And there’s an additional, equally unambiguous, intuitive and idiomatic way in C++ that provides some more type safety: numeric_limits. That -1 trick is a classic case of being clever.

besc
  • 2
    Use of `_MAX` macros obliges code maintenance should `i` change from `unsigned short` to `unsigned long`. The `some_unsigned_type i = -1;` "trick" does not need that maintenance. This does not mean that `i = -1;` is a great general-purpose idea. Yet it may make sense in select cases that are not silly. The larger context is needed to make a good judgment, and that is something OP has not presented. – chux - Reinstate Monica Dec 07 '17 at 16:54
  • 3
    @chux True about the context. But I’m confident that legitimate cases of using this trick are few and far between. And even then I wouldn’t want to see it in its naked form, but wrapped in a macro or using a named constant to get rid of the ambiguity. – besc Dec 07 '17 at 17:13
  • I agree on all points in the [comment](https://stackoverflow.com/questions/47698476/is-1-correct-for-using-as-maximum-value-of-an-unsigned-integer/47699866?noredirect=1#comment82360612_47699866). – chux - Reinstate Monica Dec 07 '17 at 17:15
  • 2
    I have used the trick myself, although usually as `0 - 1` rather than `-1`. I just think it looks more intentional that way. – Ian Abbott Dec 07 '17 at 17:37
  • I have encountered only *one* usecase of the -1 trick in my C coding that I consider defensible: when abstracting code that has to be repeated for *all* unsigned integral types into a type generic macro because you need to support C89 or C99 compilers (so you have neither C++'s nor C11's type generic capabilities to help you). – mtraceur Dec 07 '17 at 21:41
8

The C++ standard says this about signed to unsigned conversions ([conv.integral]/2):

If the destination type is unsigned, the resulting value is the least unsigned integer congruent to the source integer (modulo 2^n where n is the number of bits used to represent the unsigned type). [ Note: In a two's complement representation, this conversion is conceptual and there is no change in the bit pattern (if there is no truncation). — end note ]

So yes, converting -1 to an n-bit unsigned integer will always give you 2^n - 1, regardless of which signed integer type the -1 started as.
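A short sketch of that claim, checking several source and destination types at compile time. The helper `minus_one_gives_max` and the constant `n` are names invented for this example.

```cpp
#include <limits>

// Hypothetical helper: stores -1 in the signed type From, converts it to
// the unsigned type To, and compares the result with To's maximum.
template <typename To, typename From>
constexpr bool minus_one_gives_max() {
    return static_cast<To>(static_cast<From>(-1)) ==
           std::numeric_limits<To>::max();
}

// The signed source type does not matter, narrower or wider.
static_assert(minus_one_gives_max<unsigned int, signed char>(), "from signed char");
static_assert(minus_one_gives_max<unsigned int, long long>(), "from long long");
static_assert(minus_one_gives_max<unsigned long long, short>(), "from short");
static_assert(minus_one_gives_max<unsigned char, int>(), "from int");

// n = number of value bits in unsigned int.
constexpr int n = std::numeric_limits<unsigned int>::digits;

// 2^n - 1 computed as (2^(n-1) - 1) * 2 + 1, so that no intermediate
// step exceeds the range of unsigned int.
static_assert(static_cast<unsigned int>(-1) ==
                  ((1u << (n - 1)) - 1u) * 2u + 1u,
              "-1 converts to 2^n - 1");
```

The last assertion spells out the 2^n - 1 value arithmetically rather than relying on `numeric_limits`, which makes the modulo rule itself the thing being tested.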

Whether or not unsigned x = -1; is more or less readable than unsigned x = UINT_MAX; though is another discussion (there's definitely the chance that it'll raise some eyebrows, maybe even your own when you look at your own code later;).

rustyx
  • 1
    I think it's this language that defines the wide unsigned = narrower signed case as performing sign extension. To get zero extension, you have to cast to a narrow unsigned before assigning. Examples for x86-64: https://godbolt.org/g/yHj8fC (which uses 2's complement, so this only demonstrates that you get sign extension, e.g. to all-ones, not `0x00000000FFFFFFFF`). – Peter Cordes Dec 07 '17 at 21:33
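To make the point of the last comment concrete, here is a minimal sketch using fixed-width types, assuming the implementation provides `std::int32_t`, `std::uint32_t` and `std::uint64_t`. The constant names are invented for illustration.

```cpp
#include <cstdint>

constexpr std::int32_t minus_one = -1;

// Direct conversion works on the value -1, so the 64-bit result is the
// 64-bit maximum (all bits set).
constexpr std::uint64_t direct = minus_one;

// Converting to the narrow unsigned type first fixes the value at
// 2^32 - 1; widening an unsigned value afterwards leaves it unchanged.
constexpr std::uint64_t via_u32 = static_cast<std::uint32_t>(minus_one);

static_assert(direct == 0xFFFFFFFFFFFFFFFFULL, "value-based conversion");
static_assert(via_u32 == 0x00000000FFFFFFFFULL, "narrow unsigned, then widen");
static_assert(direct != via_u32, "the two routes differ");
```

This is also why `-1` keeps working when the destination type grows, while a constant that already has a narrower unsigned type does not.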