
strtoull("-1", NULL, 0) evaluates to 18446744073709551615 (0xFFFFFFFFFFFFFFFF, i.e. ULLONG_MAX) on the systems I tested (OS X with Apple Libc, Linux with glibc).

Since strtoull is supposed to check for values out of the range of the return type, why does it not return 0 for all negative values?

EDIT: the behavior around -ULLONG_MAX seems inconsistent too:

strtoul("-18446744073709551615", NULL, 0) -> 1, errno=0
strtoul("-18446744073709551616", NULL, 0) -> 18446744073709551615, errno=34
chqrlie
  • https://pubs.opengroup.org/onlinepubs/9699919799/functions/strtoul.html : "[...] If the subject sequence begins with a <hyphen-minus>, the value resulting from the conversion shall be negated. [...]" – pmg Mar 07 '19 at 20:58
  • Because the standard says so? http://port70.net/~nsz/c/c11/n1570.html#7.22.1.4p3 - *If the value of base is zero, the expected form of the subject sequence is that of an integer constant as described in 6.4.4.1, optionally preceded by a plus or minus sign,...* - so any valid integer constant will do. – Eugene Sh. Mar 07 '19 at 20:58
  • Also *If the subject sequence begins with a minus sign, the value resulting from the conversion is negated (in the return type).* - and the negation in the return type will be like `-1ULL` here. – Eugene Sh. Mar 07 '19 at 21:03
  • Related (maybe duplicate): https://stackoverflow.com/questions/32895853/does-the-implementation-of-strtoul-in-glibc-conflicts-with-the-c11-standard – M.M Mar 07 '19 at 21:05
  • Note that `18446744073709551615` is `0xFFFFFFFFFFFFFFFF`. – Gwyn Evans Mar 07 '19 at 21:05
  • regarding the EDIT could you explain what your objection is to the observed behaviour? – M.M Mar 07 '19 at 22:37
  • @M.M: the EDIT shows an unexpected discontinuity. If the value is clipped before the negation, the result should be 1, if it is clipped after the negation, the result should be 0. For the result to be `ULLONG_MAX` there must be a test that *prevents* the negation when the absolute value exceeds `ULLONG_MAX`, which is bizarre and counterintuitive. – chqrlie Mar 07 '19 at 22:44
  • The max value of unsigned long long on this system is 18446744073709551615 so I do not see why a discontinuity is unexpected -- converting values in range succeeds, and converting an out of range value gives a range error (you should find 34 is ERANGE). – M.M Mar 07 '19 at 22:54
  • @M.M: IMHO the C Standard specification of the behavior on negative values is confusing for unsigned types and should be reworded to clarify the behavior for out of range values. Common sense dictates that negative values are out of range for an unsigned type. I understand the C Standard committee prefers documenting existing behavior, and I do not question this choice. I just want to underscore how the specification is unclear. – chqrlie Mar 07 '19 at 23:05
  • @chqrlie It clearly specifies that `-1` gives ULLONG_MAX and -ULLONG_MAX gives `1`. The return value specification in case of range error is *slightly* unclear but of the listed options, `ULLONG_MAX` seems more appropriate than any of the other options – M.M Mar 07 '19 at 23:08
  • @M.M: *slightly* unclear indeed. – chqrlie Mar 07 '19 at 23:19

1 Answer

The prevailing interpretation of the C standard (section 7.20.1.4 in C99, section 7.22.1.4 in C11, paragraph 5 in both) is that the conversion is performed in a first step that disregards the minus sign and produces an unsigned result. This result is then negated. This is suggested by

If the subject sequence begins with a minus sign, the value resulting from the conversion is negated (in the return type).

in the standard text. Negating values of unsigned type is well-defined, so the overall result is representable if the first step resulted in a representable value. There is no subsequent error due to the negation.
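
As a small illustration of why the negation step cannot fail, the sketch below checks the two wraparound identities at compile time. The helper name `negate_ull` is hypothetical and only mirrors the phrase "negated (in the return type)"; it is not part of any library.

```c
#include <assert.h>
#include <limits.h>

/* Unsigned arithmetic is reduced modulo ULLONG_MAX + 1, so both of
   these identities are integer constant expressions that hold on any
   conforming implementation. */
static_assert(-1ULL == ULLONG_MAX, "negating 1 yields ULLONG_MAX");
static_assert(-ULLONG_MAX == 1ULL, "negating ULLONG_MAX yields 1");

/* Hypothetical helper: negate a value "in the return type". */
unsigned long long negate_ull(unsigned long long v)
{
    return 0ULL - v;   /* well-defined wraparound, never an error */
}
```

Under this reading, "-1" converts to 1 in the first step and `negate_ull(1)` gives ULLONG_MAX, which matches the result reported in the question.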

On the other hand, if the input string contains a number that is so large that it cannot be represented as an unsigned long long int value, the first step of the conversion cannot result in a representable value, and paragraph 8 applies:

If the correct value is outside the range of representable values, […] ULLONG_MAX is returned […], and the value of the macro ERANGE is stored in errno.

Again, virtually all implementers interpret the standard in such a way that the range check applies only to the first conversion step, from an arbitrary-precision nonnegative integer in the input string to the unsigned type.

Florian Weimer
  • Note that, if this is inconvenient for your application, you can simply parse any leading whitespace or sign first and only pass to `strtoul` the portion of the input beginning with a digit. – R.. GitHub STOP HELPING ICE Mar 07 '19 at 21:31
  • Your reading of the standard is interesting, but should be refined: it does not explain these behaviors: `strtoul("-18446744073709551615", NULL, 0) -> 1, errno=0` but `strtoul("-18446744073709551616", NULL, 0) -> 18446744073709551615, errno=34` – chqrlie Mar 07 '19 at 22:25
  • @chqrlie the question did not mention those behaviours initially. This answer does happen to cover your first example though – M.M Mar 07 '19 at 22:36
  • The standard explicitly says "the sequence of characters **starting with the first digit** is interpreted as an integer constant according to the rules of 6.4.4.1" so I don't see any room for alternative interpretations in your first paragraph – M.M Mar 07 '19 at 22:56
  • @M.M: OK, but the same paragraph goes on saying *If the subject sequence begins with a minus sign, the value resulting from the conversion is negated (in the return type)* What does it mean to negate a value in the return type when the value is greater than the maximum value of the type? The reference to 6.4.4.1 is also confusing since `18446744073709551615` for example is too large to parse as an integer constant without a `U` suffix. – chqrlie Mar 07 '19 at 23:15
  • @chqrlie if conversion fails then there is no "value resulting from the conversion", the text you quoted there can only refer to the case of conversion succeeding. – M.M Mar 07 '19 at 23:42
  • 6.4.4.1 doesn't have any limitation on length of integer constant parsing. The table in 6.4.4.1/5 refers to assigning types to an integer constant but that isn't a required step for strtoull which is specified to use unsigned long long. The 6.4.4.1/6 talks more about cases that don't fit in the table of 6.4.4.1/5 – M.M Mar 07 '19 at 23:50
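
The workaround suggested in the comments above (consume leading whitespace and any sign yourself, then hand only the digits to the library) might look like the sketch below. The name `parse_ull_strict` and its bool-plus-out-parameter interface are assumptions made for illustration. It uses base 10, rejects every input with a minus sign (including "-0"), and rejects trailing characters.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical wrapper: accepts optional whitespace and '+', rejects '-'. */
bool parse_ull_strict(const char *s, unsigned long long *out)
{
    assert(s != NULL && out != NULL);

    while (isspace((unsigned char)*s))
        s++;
    if (*s == '+')
        s++;
    if (!isdigit((unsigned char)*s))
        return false;            /* '-', a second sign, or no digits */

    char *end;
    errno = 0;
    unsigned long long v = strtoull(s, &end, 10);
    if (errno == ERANGE || *end != '\0')
        return false;            /* too large, or trailing characters */

    *out = v;
    return true;
}
```

Because strtoull only ever sees a string that starts with a digit, the negation rule never comes into play, and the only remaining failure modes are overflow and trailing input.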