Is this a bad way to check if a string represents a floating point number?

Question

Looking around online, I didn't find a solution that satisfied me, so I wrote my own. However, a lecture mentioned that nested function calls waiting on their returns can overflow the stack, so is this a bad idea? I use this function to check whether argv[1] is a float or not. Would a loop be better, or is there something far more intuitive? There must be exactly one point, and it has to be followed by at least one digit, right?

#include <stdbool.h>
#include <ctype.h>

/**
 * checks if string is floating point number
 * please call function with pointCounter=0 and digitAfterPoint=false
 */
bool isFloatString(char *s, int pointCounter, bool digitAfterPoint) 
{                                                                   

    if (isdigit(*s))
    {
        if(pointCounter==1)
        {
            digitAfterPoint=true;
        }
        return  isFloatString(s+1, pointCounter, digitAfterPoint);
    }

    else if (*s == '.' && pointCounter==0)
    {
        return isFloatString(s+1, pointCounter+1,digitAfterPoint);
    }
    else if (*s == '\0' && digitAfterPoint)
    {
        return true;
    }
    else
    {
        return false;
    }

}

Except different environments may have a different idea of what constitutes a floating point number. Case in point, C++ compilers are perfectly happy with the following: `const float myVal = 3.f;` Is there anything wrong with using `sscanf`? — enhzflep, Dec 20 '17 at 06:46
[Recursive function](https://stackoverflow.com/questions/5250733/what-are-the-advantages-and-disadvantages-of-recursion) calls are more prone to stack overflow. — ntshetty, Dec 20 '17 at 06:46
`0x2a.bcp4`, `-2.21l`, `3.0e4L`, `3e-4` are also examples of valid floating-point literals in C — phuclv, Dec 20 '17 at 07:04
@enhzflep: `sscanf` (really, any C library function that parses numbers) is locale-aware and locale is process-wide and not thread-safe (so you cannot just switch to C for a moment and then switch it back), so the only way to parse data in a locale-independent way (e.g. to parse JSON) is to do the job yourself. This is IMO one of the saddest design mistakes of the C standard library, since unlike many others there's no trivial workaround. — Matteo Italia, Dec 20 '17 at 07:16
@MatteoItalia - magic! Thank-you for such an articulate and helpful reply. I so rarely step back into C these days, and when I do it's for personal micro-controller projects. — enhzflep, Dec 20 '17 at 07:23
You could use [`strtod()`](http://pubs.opengroup.org/onlinepubs/9699919799/functions/strtod.html) to do the conversion. It tells you where it stopped converting, so you could decide whether the trailing debris after a number is acceptable. OTOH, it skips leading white space; if you don't want to accept that, you have to detect it before calling `strtod()` — which isn't hard but simply needs to be considered as an issue, if it is an issue. — Jonathan Leffler, Dec 20 '17 at 09:38
I have never seen a more complicated way of checking for a valid float. Unless you are entering an obfuscated C contest, this is a fail! — John3136, Dec 20 '17 at 10:48

Joop Eggen · Accepted Answer · 2017-12-20T10:29:11.067

4

With 999 digits and one point, there are 1000 recursive calls, each putting a return address and three parameters on the stack. I would find that acceptable. However, an iterative solution does away with the state parameters and is easier to read (in this case only).

bool isFloatString(char *s)
{
    int pointCounter = 0;
    bool digitAfterPoint = false;
    while (*s != '\0')
    {
        if (isdigit((unsigned char)*s))
        {
            // stays false for digits before the point, becomes true after it
            digitAfterPoint = (pointCounter == 1);
        }
        else if (*s == '.' && pointCounter == 0)
        {
            ++pointCounter;
        }
        else
        {
            return false;
        }
        ++s;
    }
    return digitAfterPoint;
}
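
To see which inputs the loop accepts, here is a minimal self-contained sketch. It repeats the same logic so that it compiles on its own; `countFailures` and its table of sample strings are hypothetical names added only for illustration, not part of the question or the answer.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/* Same rule as the loop above: digits, exactly one '.',
   and at least one digit after the point. */
bool isFloatString(const char *s)
{
    int pointCounter = 0;
    bool digitAfterPoint = false;
    for (; *s != '\0'; ++s)
    {
        if (isdigit((unsigned char)*s))
        {
            if (pointCounter == 1)
                digitAfterPoint = true;
        }
        else if (*s == '.' && pointCounter == 0)
        {
            ++pointCounter;
        }
        else
        {
            return false; /* second point, sign, exponent, or other character */
        }
    }
    return digitAfterPoint;
}

/* Hypothetical helper: counts sample inputs whose verdict
   differs from what the rule above implies. */
int countFailures(void)
{
    static const struct { const char *text; bool expected; } cases[] = {
        { "3.14", true },   { ".5", true },   { "42", false },  { "1.", false },
        { "1.2.3", false }, { "", false },    { "-1.0", false }, { "1e5", false },
    };
    int failures = 0;
    for (size_t i = 0; i < sizeof cases / sizeof cases[0]; ++i)
    {
        if (isFloatString(cases[i].text) != cases[i].expected)
            ++failures;
    }
    return failures;
}
```

Note that signs and exponents such as `-1.0` or `1e5` are rejected by this rule, which may or may not be what you want for argv[1].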

Mind: the recursive solution is vulnerable to a stack overflow from a maliciously long input.

@MatteoItalia rightly pointed out that this is only tail recursion (nothing is done with the result of the recursive call), so any mature C/C++ compiler would transform the recursion into jumps (iteration). Here is his disassembly (see the link in the comments too).

isFloatString(char*, int, bool):
  movsx ecx, BYTE PTR [rdi]
  mov r9d, edx
  mov r8d, ecx
  sub ecx, 48
  cmp ecx, 9
  jbe .L23
  cmp r8b, 46
  je .L24
  test r8b, r8b
  sete al
  and eax, edx
  ret
.L24:
  xor eax, eax
  test esi, esi
  je .L25
.L1:
  rep ret
.L23:
  movsx eax, BYTE PTR [rdi+1]
  mov ecx, eax
  sub eax, 48
  cmp esi, 1
  je .L26
  cmp eax, 9
  movzx edx, dl
  jbe .L10
  cmp cl, 46
  je .L27
.L8:
  test cl, cl
  sete al
  and eax, r9d
  ret
.L26:
  cmp eax, 9
  jbe .L28
  xor eax, eax
  cmp cl, 46
  mov r9d, 1
  jne .L8
  jmp .L1
.L28:
  mov edx, 1
.L10:
  add rdi, 2
  jmp isFloatString(char*, int, bool)
.L25:
  movzx edx, dl
  add rdi, 1
  mov esi, 1
  jmp isFloatString(char*, int, bool)
.L27:
  xor eax, eax
  test esi, esi
  jne .L1
  add rdi, 2
  mov esi, 1
  jmp isFloatString(char*, int, bool)

edited Dec 20 '17 at 10:29

answered Dec 20 '17 at 06:58

Worth noting that OP solution is actually tail recursive. While it's true that it's not strictly guaranteed, I expect that any compiler worth using will transform the call into a jump. – Matteo Italia Dec 20 '17 at 07:10
@MatteoItalia true (not doing something with the result), but it almost looks as if it were not tail recursive, and uncertain compiler behaviour should at least ring an alarm bell for a stack overflow. However I would love to see an answer from someone with the generated code, LLVM maybe. (Not pressing you - I have no time either.) – Joop Eggen Dec 20 '17 at 07:24
Of course, in fact it was mostly a technicality; also, the most obvious counterargument against OP's solution is what you said in your answer - the iterative solution in this case is way easier to read. For the code, I tried to provide it using the great gcc.godbolt.org, but unfortunately pasting code there from mobile is seriously broken. I'll add it in 15 minutes, just the time to get to a real computer. – Matteo Italia Dec 20 '17 at 07:29
@MatteoItalia Thanks a lot. As recursion is such a nice thing, I would not see it condemned unnecessarily. – Joop Eggen Dec 20 '17 at 07:38
... and here it is - https://godbolt.org/g/MHxEDZ . There's no `call`, only straight `jmp`s to the start of the function. – Matteo Italia Dec 20 '17 at 07:54
@MatteoItalia much appreciated. – Joop Eggen Dec 20 '17 at 10:22
Thank you guys. Indeed it was the readability that concerned me a lot too, as I hate unreadable code and assumed no one would actually try to read mine, and after learning the idea of recursion I don't see the simplest solutions sometimes. I learned quite a bit from this answer and the comments on it, thank you. Also instead of int I could just use bool I guess. – Duc Nguyen Dec 23 '17 at 19:14

score 0 · Answer 2 · answered Dec 20 '17 at 17:42

check if a string represents a floating point number?

The C standard library provides a simple, robust solution using `strtof()`, `strtod()`, or `strtold()`.

// leading whitespace OK, trailing text OK, over/underflow OK
bool isFloatString_Simple(const char *s) {
  char *endptr;
  strtof(s, &endptr);
  return endptr > s;
}

Are leading spaces OK?
Is trailing junk after the numeric text OK?
Is overflow a concern?
Is underflow a concern?

If any of these matter, more code is needed. Adjust as required.

#include <ctype.h>
#include <errno.h>
#include <math.h>
#include <stdbool.h>
#include <stdlib.h>

bool isFloatString_Picky(const char *s) {
  char *endptr;
  errno = 0;
  float f = strtof(s, &endptr);
  if (s == endptr) return false; // no conversion

  if (isspace((unsigned char) *s)) return false; // reject leading white-space
  if (*endptr) return false; // reject junk after numeric text

  if (errno) {
    if (fabsf(f) > 1.0f) return false; // reject on overflow
    // yet pass on underflow.
  }
  return true;
}
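
If the caller needs to know why a string was accepted or rejected, the same `strtof()` and `errno` checks can report a status instead of a plain bool. The sketch below is only an illustration: the `ParseStatus` enum and `classifyFloatString` are hypothetical names, not part of the standard library. It allows leading white space and trailing text, as the simple version does. Also keep in mind that the C standard leaves it implementation-defined whether `errno` is set to `ERANGE` on underflow, so the underflow branch may never trigger on some platforms.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical status codes, for illustration only. */
typedef enum {
    PARSE_NONE,      /* no conversion was performed */
    PARSE_OK,        /* a value was converted without a range error */
    PARSE_OVERFLOW,  /* magnitude too large for float */
    PARSE_UNDERFLOW  /* magnitude too small, if the platform reports it */
} ParseStatus;

ParseStatus classifyFloatString(const char *s)
{
    char *endptr;
    errno = 0;
    float f = strtof(s, &endptr);

    if (endptr == s)
        return PARSE_NONE;

    if (errno == ERANGE)
    {
        /* On overflow strtof returns +/-HUGE_VALF; on underflow a tiny value. */
        return (f > 1.0f || f < -1.0f) ? PARSE_OVERFLOW : PARSE_UNDERFLOW;
    }
    return PARSE_OK;
}
```

A caller could then treat `PARSE_OVERFLOW` as an error and accept `PARSE_UNDERFLOW`, matching the choice made in `isFloatString_Picky`.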

This is all fine until QApplication (or anybody else, really) does a `setlocale("", LC_ALL)` and `strtod` then expects `,` as the decimal separator. Now you can no longer use this code to parse e.g. JSON. — Matteo Italia, Dec 20 '17 at 18:07
@MatteoItalia Perhaps. OP's `*s == '.'` implies a required `'.'`, in which case the locale should be ensured to have a matching `'.'`. OP may be interested in locale-adjusting code, in which case `strtof()` meets that. There is much left unspecified in OP's top-level goal. BTW I think you meant `setlocale(LC_ALL, "")` — chux - Reinstate Monica, Dec 20 '17 at 18:19
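
To make the locale concern from these comments concrete: the standard `localeconv()` function reports the decimal point that `strtof()` will expect. The sketch below assumes a policy of refusing to parse unless that separator is `"."`. Both `localeUsesDotDecimalPoint` and `isFloatString_DotLocale` are hypothetical helper names, and this does nothing about the thread-safety issue raised earlier, because another thread could still change the locale between the check and the conversion.

```c
#include <locale.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: true if the active locale's decimal point is ".". */
bool localeUsesDotDecimalPoint(void)
{
    const struct lconv *lc = localeconv();
    return strcmp(lc->decimal_point, ".") == 0;
}

/* Hypothetical wrapper: rejects everything when the locale would make
   strtof() expect a different separator, instead of silently misparsing. */
bool isFloatString_DotLocale(const char *s)
{
    if (!localeUsesDotDecimalPoint())
        return false;

    char *endptr;
    (void)strtof(s, &endptr);
    return endptr > s && *endptr == '\0'; /* something converted, no trailing text */
}
```

For data formats with a fixed syntax such as JSON, a hand-written check like the loop in the accepted answer avoids the locale dependency altogether.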
