
How can I detect whether strtol() failed to convert a number? I tested it on the following simple case and it printed 0. The obvious question is: how do I differentiate between a non-conversion and the conversion of 0?

long int li1;
li1 = strtol("some string with no numbers",NULL,10);
printf("li1: %ld\n",li1);

****
li1: 0
Apollo
  • 2
    [c++ - correct usage of strtol](http://stackoverflow.com/questions/14176123/correct-usage-of-strtol) – DOOM Sep 28 '14 at 05:54

1 Answer


The strtol declaration in stdlib.h is as follows:

long int strtol(const char *nptr, char **endptr, int base);

strtol provides an error-checking and validation scheme that lets you determine whether the value returned is valid or invalid. Essentially, you have three primary tools at your disposal: (1) the value returned, (2) the value errno is set to by the call, and (3) the addresses and contents of nptr and endptr provided to, and set by, strtol. (See man 3 strtol for complete details. The example in the man page uses a shorter set of conditions to check; they have been expanded below for explanation.)

In your case you ask about a 0 return value and how to determine whether it is valid. As you have seen, a 0 returned by strtol does not by itself mean that the number read was 0. To determine whether the 0 is valid, you must also look at the value errno was set to during the call (if it was set). Because strtol never resets errno on success, set errno to 0 before the call. Specifically, if errno != 0 and the value returned by strtol is 0, then the value returned by strtol is INVALID. (On implementations that set EINVAL, this indicates an invalid base or that no conversion was performed. Overflow and underflow do not return 0: they return LONG_MAX or LONG_MIN with errno set to ERANGE.)
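As a minimal sketch of the errno part of the check, the helper below resets errno, calls strtol, and reports whether a range error occurred. The function name out_of_range and the fixed base of 10 are assumptions made for illustration; they are not part of the original answer or of any library.

```c
#include <errno.h>
#include <limits.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical helper: returns true if strtol reported a range error
 * for the base-10 string s. The (possibly clamped) result goes to *out. */
bool out_of_range(const char *s, long *out)
{
    errno = 0;                  /* strtol does not clear errno on success */
    *out = strtol(s, NULL, 10);
    return errno == ERANGE;     /* if true, *out is LONG_MAX or LONG_MIN */
}
```

Calling out_of_range("0", &v) returns false and leaves v equal to 0, while a decimal string far too large for a long returns true and clamps v to LONG_MAX.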

There is a second condition that can result in strtol returning an INVALID 0: the case where no digits were read from the input. When this occurs, strtol stores the value of nptr in *endptr, so endptr == nptr after the call. Therefore, you must also check whether the pointer values are equal before concluding that a 0 value was entered. (A VALID 0 can be entered with multiple 0's in the string.)
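The pointer comparison on its own can be sketched as follows. The name converted_something is hypothetical and the base is fixed at 10 for brevity; the sketch only answers the question "were any digits consumed?" and does not check for overflow.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical helper: true if strtol consumed at least one digit of s.
 * The converted value (0 when nothing was converted) is stored in *out. */
bool converted_something(const char *s, long *out)
{
    char *end = NULL;

    *out = strtol(s, &end, 10);
    return end != s;    /* end == s means no conversion was performed */
}
```

With this check, "some string with no numbers" yields false with a stored value of 0, whereas "0" and "0000" both yield true with a stored value of 0, which is exactly the distinction the question asks for.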

The following provides a brief example of the differing error conditions to check when evaluating the return of strtol along with several different test conditions:

#include <stdio.h>
#include <stdlib.h>
#include <errno.h>
#include <limits.h>

int main (int argc, char **argv)
{
    if (argc < 2) {
        fprintf (stderr, "\n Error: insufficient input. Usage: %s int [int (base)]\n\n", argv[0]);
        return 1;
    }

    const char *nptr = argv[1];                     /* string to read               */
    char *endptr = NULL;                            /* pointer to additional chars  */
    int base = (argc > 2) ? atoi (argv[2]) : 10;    /* numeric base (default 10)    */
    long number = 0;                                /* variable holding return      */

    /* reset errno to 0 before call */
    errno = 0;

    /* call to strtol assigning return to number */
    number = strtol (nptr, &endptr, base );

    /* output original string of characters considered */
    printf ("\n string : %s\n base   : %d\n endptr : %s\n\n", nptr, base, endptr ? endptr : "(null)");

    /* test return to number and errno values */
    if (nptr == endptr)
        printf (" number : %ld  invalid  (no digits found, 0 returned)\n", number);
    else if (errno == ERANGE && number == LONG_MIN)
        printf (" number : %ld  invalid  (underflow occurred)\n", number);
    else if (errno == ERANGE && number == LONG_MAX)
        printf (" number : %ld  invalid  (overflow occurred)\n", number);
    else if (errno == EINVAL)  /* not in all c99 implementations - gcc OK */
        printf (" number : %ld  invalid  (base contains unsupported value)\n", number);
    else if (errno != 0 && number == 0)
        printf (" number : %ld  invalid  (unspecified error occurred)\n", number);
    else if (errno == 0 && nptr && !*endptr)
        printf (" number : %ld    valid  (and represents all characters read)\n", number);
    else if (errno == 0 && nptr && *endptr != 0)
        printf (" number : %ld    valid  (but additional characters remain)\n", number);

    printf ("\n");

    return 0;
}

output:

$ ./bin/s2lv 578231

 string : 578231
 base   : 10
 endptr :

 number : 578231    valid  (and represents all characters read)

$ ./bin/s2lv 578231_w_additional_chars

 string : 578231_w_additional_chars
 base   : 10
 endptr : _w_additional_chars

 number : 578231    valid  (but additional characters remain)

$ ./bin/s2lv 578some2more3stuff1

 string : 578some2more3stuff1
 base   : 10
 endptr : some2more3stuff1

 number : 578    valid  (but additional characters remain)

$ ./bin/s2lv 00000000000000000

 string : 00000000000000000
 base   : 10
 endptr :

 number : 0    valid  (and represents all characters read)

$ ./bin/s2lv stuff578231

 string : stuff578231
 base   : 10
 endptr : stuff578231

 number : 0  invalid  (no digits found, 0 returned)

$ ./bin/s2lv 00000000000000000 -2

 string : 00000000000000000
 base   : -2
 endptr : (null)

 number : 0  invalid  (base contains unsupported value)
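For reuse outside a test program, the same checks can be folded into a single function. This is only a sketch under stated assumptions: the enum parse_status, its constants, and the function parse_long are hypothetical names, and the base is validated before the call rather than by testing for EINVAL afterwards, since not every C library sets EINVAL.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical result codes; not part of any standard API. */
enum parse_status {
    PARSE_OK,          /* whole string converted                  */
    PARSE_TRAILING,    /* number converted, extra characters left */
    PARSE_NO_DIGITS,   /* nothing converted (endptr == nptr)      */
    PARSE_RANGE,       /* overflow or underflow (ERANGE)          */
    PARSE_BAD_BASE     /* base is not 0 and not in 2..36          */
};

/* Hypothetical wrapper: converts s in the given base. *out is written
 * only for PARSE_OK and PARSE_TRAILING. */
enum parse_status parse_long(const char *s, int base, long *out)
{
    char *end = NULL;
    long value;

    /* Reject unsupported bases up front instead of relying on EINVAL. */
    if (base != 0 && (base < 2 || base > 36))
        return PARSE_BAD_BASE;

    errno = 0;                      /* reset before the call */
    value = strtol(s, &end, base);

    if (end == s)
        return PARSE_NO_DIGITS;     /* the returned 0 is not a real 0 */
    if (errno == ERANGE)
        return PARSE_RANGE;         /* value is LONG_MAX or LONG_MIN */

    *out = value;
    return (*end == '\0') ? PARSE_OK : PARSE_TRAILING;
}
```

Run against the inputs shown above, this sketch would classify "578231" as PARSE_OK, "578some2more3stuff1" as PARSE_TRAILING with a value of 578, "stuff578231" as PARSE_NO_DIGITS, and a base of -2 as PARSE_BAD_BASE. Whether trailing characters should count as an error depends on the caller, which is why they get their own code.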
John Hascall
David C. Rankin
  • 1
    Correct me if I'm mistaken, but the initial assignments to number and *endptr aren't necessary right? – Dean Gurvitz Sep 08 '18 at 11:55
  • 2
    Right, but developing good habits like initializing all variables (especially arrays) can avoid inadvertent attempts to access uninitialized values (and *undefined behavior*). – David C. Rankin Sep 08 '18 at 23:21
  • 6
    `0 is returned and errno may be set to [EINVAL]` "robust" – goji Dec 29 '18 at 00:03
  • I wish the C++ version just excepted if it could not convert for any reason. – Zitrax Aug 11 '20 at 06:05
  • 1
    Detail: The discussion about `EINVAL` reflects an non-standard C library implementation extension. Error checking of `strtol()` does not need a test involving `EINVAL` (as suggested by `not in all c99 implementations - gcc OK`), yet may benefit, when `EINVAL` exists to provide additional detail. – chux - Reinstate Monica Sep 27 '20 at 13:26
  • Use of `%lu` is amiss in 7 places. For `long`, use `"%ld"`. – chux - Reinstate Monica Sep 27 '20 at 13:28
  • `else if (errno != 0 && number == 0)` is curious. Why not simply `else if (errno != 0)` or drop the test altogether? What if `errno != 0 && number != 0`? (no final `else` to catch that.) As I see it, an implementation may use `errno` to report details of a problem found by other means (like `EINVAL` when `nptr == endptr`), but is not allowed to create a different error requiring a `errno` test other than `ERANGE`. Hmmmm. – chux - Reinstate Monica Sep 27 '20 at 13:38
  • It does makes sense for an implementation to define what otherwise would be UB and set `errno`, like when `nptr == NULL` or invalid `base` so `else if (errno)` is a good catch of such. – chux - Reinstate Monica Sep 27 '20 at 13:49
  • @DavidC.Rankin in C, variables with a static lifetime are guaranteed to be zero-initialized if no explicit value is provided. In fact, if your compiler doesn't optimize for it (most do though), it can make your executables larger. This likely wouldn't be an issue unless you are in an embedded environment. – John Leuenhagen Oct 03 '20 at 04:38
  • @JohnLeuenhagen Correct you are: [§ 6.7.9 Initialization (p10)](http://port70.net/~nsz/c/c11/n1570.html#6.7.9p10). Micro-controllers require adjustments for just about all buffers, etc.. you would normally not worry about on x86_64. They have become much more prevalent in teaching for programming and engineering courses over the past 6-years. – David C. Rankin Oct 03 '20 at 04:53