Consider the following example, which parses a string that contains only valid numeric characters but is too large for type long int[0].

char *endptr = NULL;
long int val = strtol("0xfffffffffffffffffffffffff", &endptr, 0);

It is my understanding that the only way to catch this error is to check whether errno is set[1]. Even though errno is supposed to be thread safe[2], this is not the case in most real-time embedded systems[3], where errno is a global int - is this correct? In that case, errno cannot be trusted, as it could have been modified from an interrupt[4] context.
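
For reference, the conventional errno-based check looks roughly like the sketch below. The helper name parse_long_errno is made up for illustration. The window between clearing errno and reading it back is where an interrupt or another task could interfere if errno is a plain global.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical helper: returns true on success, false when no digits
 * were found or the value does not fit in a long. */
bool parse_long_errno(const char *s, long *out)
{
    char *endptr = NULL;

    errno = 0;                          /* clear any stale value first */
    long val = strtol(s, &endptr, 0);

    if (endptr == s) {
        return false;                   /* no digits consumed */
    }
    if (errno == ERANGE) {
        return false;                   /* overflow or underflow reported */
    }
    *out = val;
    return true;
}
```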

Is there a way to catch this error without using errno, or is there another workaround? Implementing a custom strto* does not seem like a sane solution, but perhaps there is no other way?

[0] Similar examples could be constructed for other functions in the strto* family, such as strtod, strtof, strtold, strtoll, strtoul and strtoull.

[1] http://man7.org/linux/man-pages/man3/strtol.3.html

[2] Is errno thread-safe? (also related question)

[3] For example, in a "bare metal" project or in a low-level RTOS such as Apache mynewt, Zephyr or FreeRTOS.

[4] Modified from an interrupt or from another context that the OS scheduler might provide. I believe these typically only manage the stack and nothing more.

sfrank
  • is your program multithreaded? – OznOg May 11 '20 at 09:51
  • The man page says of `strtol` *"When the representation would cause an overflow, they return LONG_MAX or LONG_MIN."* And please see C18 §7.22.1.4 – Weather Vane May 11 '20 at 09:52
  • ...so unless `LONG_MAX` or `LONG_MIN` are acceptable inputs, these values tell you there was overflow without using `errno`. – Weather Vane May 11 '20 at 09:59
  • Calling anything that might set errno in an interrupt context, is ill advised in any case. In a multithreaded environment, it is possible that `errno` is _thread local storage_, but that might depend on your library implementation and its correct integration with your RTOS or thread-library. If they are from independent vendors or projects, there are no guarantees unless you have implemented the thread-safety yourself. – Clifford May 11 '20 at 14:46
  • @OznOg if you count interrupts as multithreaded, then yes. – sfrank May 11 '20 at 18:05
  • I think you don't call things that use errno on interrupts handlers, do you? so, not sure you have a problem... – OznOg May 12 '20 at 11:40
  • @OznOg I agree, errno should not be set from interrupts, but it can be hard to guarantee that it never happens. It is, however, reasonable that errno is set from another context such as a job, task or whatever your RTOS might call them. – sfrank May 13 '20 at 10:47
  • You may have multitasking on a single CPU/threading context. BTW, if you have a __real__ multithreaded context, you need memory barriers (or an atomic errno), which your lib may not provide. – OznOg May 13 '20 at 12:28

2 Answers

How to check strto...() overflow without errno?

First, I doubt OP's assertion: "this is not the case in most real-time embedded systems[3] where errno is a global int - is this correct?" Be that as it may, here are some ideas.

  1. For floating point (FP), re-writing strto...() is prone to failure. If your FP supports infinity, simply use isinf() to detect overflow - even if the input string was "INFINITY" - or test for such infinity strings.

Underflow for small values near 0.0 is a rabbit hole of issues for strto...(). I assume OP is not asking about that case.

  2. For integer strto...(), re-writing is not so hard, and good source code is easy to find. If one still wants to avoid that for signed strto...(), call both strtol() and strtoul() and compare the results. On overflow, the two functions return results that do not compare equal. This works for in-range negatives like "-123" too. For unsigned strto...(), I'd have to think on that.

    bool strtol_no_errno(const char *s, long *num) {
      *num = strtol(s, NULL, 0);
      unsigned long unum = strtoul(s, NULL, 0);
      return unum == (unsigned long) *num;
    }

  3. For integer strto(u)l(), when long is narrower than long long, just call strto(u)ll() and test if the result is in range.
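
The floating-point idea and the wider-type idea could be sketched roughly as follows. The names strtod_no_errno and strtoll_in_range are hypothetical. The range variant takes explicit bounds, so it only helps for a target range strictly inside that of long long; on platforms where long is as wide as long long it cannot be used for long itself. Neither helper reports "no digits found", which would need an endptr check.

```c
#include <limits.h>
#include <math.h>
#include <stdbool.h>
#include <stdlib.h>

/* Floating point (hypothetical helper): assumes the FP format has
 * infinities, so an overflowing conversion yields +/-HUGE_VAL, which is
 * +/-infinity. A literal "inf"/"infinity" input is rejected as well. */
bool strtod_no_errno(const char *s, double *num) {
  *num = strtod(s, NULL);
  return !isinf(*num);
}

/* Wider type (hypothetical helper): convert with strtoll() and
 * range-check. Requires LLONG_MIN < lo and hi < LLONG_MAX, so that the
 * saturated results LLONG_MIN/LLONG_MAX always fall outside [lo, hi]. */
bool strtoll_in_range(const char *s, long long lo, long long hi,
                      long long *num) {
  if (lo <= LLONG_MIN || hi >= LLONG_MAX) {
    return false;  /* bounds too wide: saturation would go unnoticed */
  }
  long long wide = strtoll(s, NULL, 0);
  if (wide < lo || wide > hi) {
    return false;  /* out of the requested range (or overflowed) */
  }
  *num = wide;
  return true;
}
```

With a 32-bit long, the call would be strtoll_in_range(s, LONG_MIN, LONG_MAX, &v).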


Test

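#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>  /* strtol(), strtoul() used by strtol_no_errno() */
#include <string.h>  /* strcat() */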

int main( ) {
  char buf[100] = "-1";
  for (int i = 0; i<21; i++) {
    long num;
    bool ok = strtol_no_errno(buf + 1, &num);
    printf("%30s %d %ld\n", buf + 1, ok, num);
    ok = strtol_no_errno(buf + 0, &num);
    printf("%30s %d %ld\n", buf, ok, num);
    strcat(buf, "0");
  }
}

Output

                         1 1 1
                        -1 1 -1
                        10 1 10
                       -10 1 -10
...
        100000000000000000 1 100000000000000000
       -100000000000000000 1 -100000000000000000
       1000000000000000000 1 1000000000000000000
      -1000000000000000000 1 -1000000000000000000
      10000000000000000000 0 9223372036854775807
     -10000000000000000000 0 -9223372036854775808
     100000000000000000000 0 9223372036854775807
    -100000000000000000000 0 -9223372036854775808
chux - Reinstate Monica

The first thing you should do is to sanitize the input. For example, if the string length without "0x" is larger than sizeof(long)*2, then the input is obviously too large. So start by verifying that.

Also make sure that if the input has exactly string length of sizeof(long)*2, then input[0] must be '7' or smaller for signed long. Or in case input[0] is '-', check input[1]. (Some 2's complement trickery to consider here, I'll leave that to you.)
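
A minimal sketch of such a pre-check, assuming the input is a bare run of hex digits (no "0x" prefix, no sign) and that long has 8-bit bytes and no padding bits. The name hex_fits_long is hypothetical, and the skipping of leading zeros is an addition so that zero-padded inputs are not rejected.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: true if the hex digit string (no "0x", no sign)
 * represents a value that fits in a non-negative long. */
bool hex_fits_long(const char *digits)
{
    if (*digits == '\0') {
        return false;                       /* empty input */
    }
    while (*digits == '0') {
        digits++;                           /* leading zeros add no magnitude */
    }

    size_t n = 0;
    while (isxdigit((unsigned char)digits[n])) {
        n++;                                /* count significant hex digits */
    }
    if (digits[n] != '\0') {
        return false;                       /* non-hex character present */
    }

    const size_t max_digits = sizeof(long) * 2;  /* assumes CHAR_BIT == 8 */
    if (n > max_digits) {
        return false;                       /* too many significant digits */
    }
    if (n == max_digits && !(digits[0] >= '0' && digits[0] <= '7')) {
        return false;                       /* top bit set: exceeds LONG_MAX */
    }
    return true;
}
```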

With the above sanity checks, it shouldn't be possible to get an overflow. But if you do, the function is guaranteed to return LONG_MAX (or LONG_MIN for underflow). Also, endptr is set to the beginning of the input string in case of such errors. So you can check if the conversion failed like this:

if(val != LONG_MAX) 
{ 
  /* OK, normal case */ 
}
else // val == LONG_MAX
{
  if(endptr != input &&
     (size_t)(endptr-input) == sizeof(long)*2))
  {
    /* OK, input was LONG_MAX but no overflow */ 
  }
  else
  {
    /* overflow error */
  }
}

This assumes that the input doesn't have the 0x prefix; if it does, you'll have to tweak the code accordingly.

The same check is needed for underflow.

Lundin
  • `endptr` is not set to the beginning of the string when there is an overflow. If the string starts with optional white space and characters forming an integer constant with some controls according to the base, then `endptr` is set to the first character after that, regardless of overflow. Only if no characters forming an integer constant are found is `endptr` set to the beginning of the string. So `endptr` is useless for detecting overflow. Additionally, if the length of the string after `0x` exceeds `sizeof(long)*2`, that does not guarantee overflow. There may be leading zeros. – Eric Postpischil May 11 '20 at 20:16
  • There is an excess closing parenthesis in the second `if`. – Eric Postpischil May 11 '20 at 20:16
  • The question gives an example starting with `0x` but does not assert all inputs will have it. – Eric Postpischil May 11 '20 at 20:17