I want to parse floating-point numbers from scratch in C.

But I found an obvious error, because the way float/double values are stored in computers is not precise enough.

Here is my parsing code, ignoring the cases that involve a negative sign:

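#include <stdio.h>

void parseFloat(double *coe, int *exp){

    int c = 0;             /* int rather than char, so EOF can be detected */
    double digit = 10;     /* divisor for the next fractional digit */
    *coe = 0, *exp = 0;
    int state = 0;         /* 0: integer part, 1: fraction, 2: exponent */
    while((c = getchar_unlocked()) != '\n' && c != EOF){
        if(c == '.'){
            state = 1;
            continue;
        }
        if(c == 'e'){
            state = 2;
            continue;
        }
        if(state == 0){
            *coe = *coe * 10 + c - '0';
        }else if(state == 1){
            /* each division and addition here can round, so errors add up */
            *coe += (c - '0') / digit;
            digit *= 10;
        }else if(state == 2){
            *exp = *exp * 10 + c - '0';
        }else{
            *coe = 0, *exp = 0;
            break;
        }
    }

    return;
}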

It produces badly wrong results, because the small rounding errors made in each loop step of the parsing process add up:

        input        ->          output
5.699141892149156e76 -> 5.699141892149156341 76
9.205357638345294e18 -> 9.2053576383452959675 18
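
To put a number on how far off such results are, a parsed value can be compared with a reference value and the gap counted in units in the last place (ULP). The following is only a minimal sketch, assuming IEEE 754 binary64 `double` and finite inputs of the same sign; `ulp_distance` and `print_exact` are hypothetical helper names, not library functions.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: how many representable doubles lie between a and b.
 * Assumes IEEE 754 binary64 and finite inputs of the same sign. */
int64_t ulp_distance(double a, double b)
{
    int64_t ia, ib;
    memcpy(&ia, &a, sizeof ia);   /* reinterpret the bits without aliasing issues */
    memcpy(&ib, &b, sizeof ib);
    return ia > ib ? ia - ib : ib - ia;
}

/* Hypothetical helper: print a double without hiding its low-order bits. */
void print_exact(const char *label, double x)
{
    /* %a shows the exact binary value in hexadecimal notation;
     * %.17g gives enough decimal digits to round-trip any double. */
    printf("%-10s %a  %.17g\n", label, x, x);
}
```

Calling `ulp_distance(parsed, strtod(text, NULL))` (with `<stdlib.h>` included) would then report how many representable values separate a hand-rolled result from the library's.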

So is there any decent method for parsing floating-point numbers more precisely?

And how does a built-in library function like scanf("%f", &fpn) implement it?

Thanks a lot!

Daizy
  • C and C++ are different languages. As you ask for the C function `scanf`, I think you mean C. – too honest for this site Jun 16 '16 at 13:05
  • `float` and `double` contain errors, so don't use them; keep the value as an array of digits instead. – MikeCAT Jun 16 '16 at 13:06
  • If you want to take in a double as input, you should use `scanf ("%lf",&fpn)`. That will give more precision. – Rishikesh Raje Jun 16 '16 at 13:06
  • @Olaf Thanks, I'll be more careful next time. – Daizy Jun 16 '16 at 13:09
  • @MikeCAT Thanks, but keeping them in array makes math operations really difficult, so I want to convert it to some built-in type. – Daizy Jun 16 '16 at 13:13
  • @RishikeshRaje Thanks! The built-in function `scanf` is not flexible enough for my case, but I think its implementation could give me great inspiration. Do you know something about it? – Daizy Jun 16 '16 at 13:17
  • @daizy The actual answer to your question (since I don't believe it's a duplicate) is that it's surprisingly hard to do this right. This paper should be a good start: http://kurtstephens.com/files/p372-steele.pdf (follow the references and newer papers that reference it). – Art Jun 16 '16 at 13:21
  • @Olaf RadLexus unwind I think the question is not a duplicate, because I'm not asking why the parsing result is wrong but how to parse it correctly. I think [Is floating point math broken?](http://stackoverflow.com/questions/588004/is-floating-point-math-broken) doesn't solve my problem. Thanks – Daizy Jun 16 '16 at 13:23
  • @Daizy your "intolerable wrong results" only diverge after the 15th digit, which is about as much significance as you can expect after 15 incremental errors. The type `double` can only hold 15-17 digits anyway. – Weather Vane Jun 16 '16 at 13:26
  • @Art Thank you! It seems really useful for my problem. I'll read it carefully! – Daizy Jun 16 '16 at 13:26
  • If their licence is acceptable for you, it is good to use libraries such as [GNU MP](https://gmplib.org/). – MikeCAT Jun 16 '16 at 13:30
  • I actually voted as too broad. But the dup is acceptable, because it does explain what your described problem is. Did you actually read and understand it? – too honest for this site Jun 16 '16 at 13:40
  • @Olaf I knew that floating-point numbers cannot be stored precisely in memory. But a parsing method like mine introduces an error at each step when adding each digit to the result, which means the result is not the best one the computer can store. Yeah, I'm weird. I think I should give up to find some better ways. – Daizy Jun 16 '16 at 14:03
  • @Daizy: Oh, they can be stored precisely! - Within the restrictions of the representation. You might know about the conversion and rounding problems, but you seem not to have understood **all implications** (no offence, that is no easy stuff for a beginner). – too honest for this site Jun 16 '16 at 14:06
  • An interesting problem that would benefit from re-opening, yet it lacks some crucial data: 1) How was the computed `double` output? `parseFloat()` computes the `double` yet does not print it. AFAIK, additional errors crept in because of weak printing. Use `printf("%a\n", x);` 2) The `double` computed by this code will not match the exact strings, as `double` can only represent exactly about 2**64 different numbers, and of course there are infinitely many numbers and the 2 given are not in the `double` set. What error is tolerable? – chux - Reinstate Monica Jun 16 '16 at 14:23
  • ... 3) Still, OP's code can get better, although it is work and may be suited for a later programming task. 4) The 2 numbers provided are not random; print them with `printf("%a\n",x)` to see how they are _special_. 5) Print your results to see how they were 1 or 2 [ULP](https://en.wikipedia.org/wiki/Unit_in_the_last_place) off. – chux - Reinstate Monica Jun 16 '16 at 14:24
  • BTW: A _simple_ improved-precision method with a typical `double` is to calculate the significand using an `unsigned long long --> sig`, noting the decimal point --> `offset`, then `sign*sig*pow(10, expo - offset)`. To get the best answer takes work - beyond a comment. – chux - Reinstate Monica Jun 16 '16 at 14:29
  • @chux I'm really grateful for your comment. It makes me realize that I'm not good at asking questions and don't think about problems as sensibly and logically as you do. – Daizy Jun 16 '16 at 14:46
  • @chux I used too many emotional words like 'intolerable', and did not describe my question clearly. I will revise the question soon and do a little more experimenting on it. Thanks a lot! – Daizy Jun 16 '16 at 14:48
  • Recommend a new question with specific details of input/output/expected output/tolerances, etc., and code that can, by itself (this code had no output), repeat the issue. Be sure to show how it differs from a sea of like questions - else do not post. – chux - Reinstate Monica Jun 16 '16 at 15:01

0 Answers