Writing my own float parser

Question

I am trying to write a parser in C, and part of its job is to convert a series of characters into a double. Up to now I have been using `strtod`, but I find it quite dangerous, and it won't handle cases where the number is at the end of a buffer that is not null-terminated.

I thought I'd write my own. If I have a string representation of a number of the form a.b, would I be naive to think that I can just calculate (double)a + ((double)b / (double)10^n), where n is the number of digits in b?

For example, 23.4563:

a = 23, b = 4563

final answer: 23 + (4563/10000)

Or would that produce inaccurate results with regard to the IEEE format of floats?

For one, if `b` is an integer type, `b / 10^n` is going to get rounded before being cast to `float`. You'd want to put the cast _inside_ the parentheses. Also, you'd need to make sure that there is no integer overflow in either of `a` or `b`. — Shahbaz, Mar 21 '13 at 17:44
More to think about: negative numbers, exponent format (`1.2E10`), negative exponents, ... I *really* suggest you just copy into a null-terminated buffer and let `strtod` do the heavy lifting. — Roddy, Mar 21 '13 at 17:58
Calculating `a + b/10000.f` will not always produce a correctly rounded result. (For these purposes, the “correctly rounded result” is the one nearest the mathematical value.) An example is “1.0097”. `1 + 97/10000.f` produces 1.0097000598907470703125, but the closer, and correct, `float`, is 1.00969994068145751953125. The incorrect result occurs because there is one rounding in the division and another in the addition. The rounding in the division loses information that shows the other value would have been desired in the addition. — Eric Postpischil, Mar 21 '13 at 18:30
Your question says your job is to convert to a **double** but elsewhere refers to casting to **float**. You should clarify what type you want to produce. — Eric Postpischil, Mar 21 '13 at 18:36
An example where `a+b/10000.` fails to produce the correct `double` is 1.0131. — Eric Postpischil, Mar 21 '13 at 18:40
Don't. Either copy to a null terminated buffer or get and adapt an existing correct implementation. — R.. GitHub STOP HELPING ICE, Mar 21 '13 at 19:36
I suggest you start reading http://www.exploringbinary.com/tag/convert-to-binary/ . Stop as soon as you don't feel like implementing your own decimal-to-floating-point conversion anymore. — Pascal Cuoq, Mar 21 '13 at 19:59
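
The double rounding described in the comments above can be reproduced with a small sketch. The helper names `naive_float` and `naive_double` are made up for illustration, and the `volatile` temporaries are only there to force each intermediate result to be rounded to its declared type. The sketch assumes IEEE 754 binary32/binary64 arithmetic in round-to-nearest mode.

```c
#include <assert.h>

/* Hypothetical helper: evaluates a + b / 10^n entirely in float.
   There are two roundings, one after the division and one after the sum. */
float naive_float(float a, float b, float pow10n)
{
    assert(pow10n > 0.0f);
    volatile float frac = b / pow10n;   /* first rounding  */
    volatile float sum  = a + frac;     /* second rounding */
    return sum;
}

/* Same computation in double. */
double naive_double(double a, double b, double pow10n)
{
    assert(pow10n > 0.0);
    volatile double frac = b / pow10n;  /* first rounding  */
    volatile double sum  = a + frac;    /* second rounding */
    return sum;
}
```

Under those assumptions, `naive_float(1, 97, 10000)` should compare unequal to the literal `1.0097f`, and `naive_double(1, 131, 10000)` unequal to `1.0131`, which are the two counterexamples given in the comments. In both cases the rounded quotient lands exactly halfway between two candidates for the sum, so the addition can no longer tell which side the true value was on.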

Eric Postpischil · Accepted Answer · 2021-03-11T22:31:03.670

4

It is hard to read floating-point numerals accurately, in the sense that there are various problems that must be carefully addressed, and many people fail to do so. However, it is a solved problem. To start, see How to read floating point numbers accurately, June 1990, by William D. Clinger.

I agree with Roddy, you are likely better off copying the data into a buffer and using existing library functions. (However, you should check that your C implementation provides correctly rounded conversion of floating-point numerals. The C standard does not require it, and some implementations do not provide it.)
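
The copy-then-`strtod` approach might look roughly like the sketch below. The function name `parse_double_bounded`, its return convention, and the 128-byte scratch size are assumptions made for illustration, not part of any library.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: parse a double from buf[0..len), which need not be
   null-terminated. Returns 0 on success and -1 on failure. On success,
   *consumed is the number of bytes that formed the numeral. */
int parse_double_bounded(const char *buf, size_t len,
                         double *out, size_t *consumed)
{
    char tmp[128];  /* assumed upper bound on the length of a numeral */
    char *end;
    size_t n = len < sizeof tmp - 1 ? len : sizeof tmp - 1;

    assert(buf != NULL && out != NULL && consumed != NULL);
    memcpy(tmp, buf, n);
    tmp[n] = '\0';  /* strtod can never read past this terminator */

    errno = 0;
    double value = strtod(tmp, &end);
    if (end == tmp)
        return -1;  /* no conversion was performed */
    if (n < len && end == tmp + n)
        return -1;  /* the copy was truncated and the numeral may continue */
    if (errno == ERANGE)
        return -1;  /* overflow or underflow reported by strtod */

    *out = value;
    *consumed = (size_t)(end - tmp);
    return 0;
}
```

Note that `strtod` also accepts leading whitespace, hexadecimal input, `inf`, and `nan`, and honors the locale's decimal point, so a strict parser may want to validate the characters itself before handing them over. Treating `ERANGE` as a hard failure is a simplification; a caller could instead accept the returned infinity or subnormal value.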

answered Mar 21 '13 at 18:19

The paper at the link is gone. archive.org still had it: https://web.archive.org/web/20170329102230/http://www.cesura17.net/~will/Professional/Research/Papers/howtoread.pdf It seems to be "How to Read Floating-Point Numbers Accurately." by William D. Clinger https://doi.org/10.1145/93548.93557 https://dblp.org/rec/conf/pldi/Clinger90 – Caesar Mar 11 '21 at 08:53

score 1 · Answer 2 · edited May 23 '17 at 11:57

You may be interested in this answer of mine to a somewhat related question.

The parser in that answer converts decimal floating point numbers (represented as strings) into IEEE-754 floats and doubles with proper rounding.

As far as I remember, about the only issue in the code is that it may not handle the cases when the exponent part is too big (doesn't fit into an integer) and should amount to returning either an error or INF.

Otherwise, it should give you a good idea of what to do (if you have any idea at all of what you're doing:).

answered Mar 21 '13 at 20:42

Alexey Frunze

Nice code. One remark though: you say “the cases when the exponent part is too big (doesn't fit into an integer) and should amount to returning either an error or INF.” That's funny, because I too wrote my own decimal to floating-point function (a shorter one than yours because I was able to rely on an existing bigint implementation), and I made the same mistake as you, that is, assuming that an exponent too large to fit in an int means the float is an infinite. http://blog.frama-c.com/index.php?post/2012/11/19/Funny-floating-point-bugs-in-Frama-C-Oxygen-s-front-end – Pascal Cuoq Mar 21 '13 at 22:21
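
As the comment above points out, an exponent field too large for an `int` does not by itself mean the result is infinite: `0e99999999999` is still zero, for instance. One way to avoid both the integer overflow and the premature decision is to clamp the exponent while reading it and decide only once the significand is known. A minimal sketch follows, where `read_exponent_clamped` and `EXP_CLAMP` are hypothetical names and the bound is an assumption that must exceed the exponent range of `double` plus the largest number of significand digits the parser accepts.

```c
#include <assert.h>
#include <stddef.h>

#define EXP_CLAMP 100000  /* assumed bound, far outside the range of double */

/* Hypothetical helper: reads an optional sign and decimal digits from
   [p, end). Stores the exponent, clamped to +/-EXP_CLAMP, in *exp10 and
   returns a pointer just past the last character consumed. */
const char *read_exponent_clamped(const char *p, const char *end, int *exp10)
{
    int negative = 0;
    int value = 0;

    assert(p != NULL && end != NULL && exp10 != NULL);
    if (p < end && (*p == '+' || *p == '-')) {
        negative = (*p == '-');
        p++;
    }
    while (p < end && *p >= '0' && *p <= '9') {
        if (value < EXP_CLAMP)          /* stop accumulating before overflow */
            value = value * 10 + (*p - '0');
        p++;                            /* but keep consuming the digits */
    }
    if (value > EXP_CLAMP)
        value = EXP_CLAMP;
    *exp10 = negative ? -value : value;
    return p;
}
```

The caller would then combine the clamped exponent with the significand: a zero significand yields zero regardless of the exponent, and only a nonzero significand with a saturated exponent is a candidate for overflow or underflow.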

score 1 · Answer 3 · edited May 23 '17 at 11:49

As already said, it's difficult: you need extra precision, etc.

But if you have restricted inputs and want to know whether you can still convert these restricted decimals to binary correctly with a semi-naive algorithm and standard IEEE 754 operations, you might be interested in my answer to

How to manually parse a floating point number from a string
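
One commonly cited restricted case (not necessarily the one analysed in the linked answer) is a plain `a.b` numeral with at most 15 digits in total. All the digits then form an integer below 2^53, and the power of ten is exactly representable as a `double`, so a single division performs the only rounding. For 23.4563 that means computing 234563 / 10000 rather than 23 + 4563/10000. The sketch below assumes IEEE 754 binary64 arithmetic evaluated in `double` precision; `parse_simple_decimal` is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical restricted parser: digits with at most one '.', no sign,
   no exponent, at most 15 digits in total. Returns 0 on success and -1
   if the input is outside this restricted form. */
int parse_simple_decimal(const char *buf, size_t len, double *out)
{
    /* Powers of ten up to 1e15 are exactly representable in double. */
    static const double pow10[] = {
        1e0, 1e1, 1e2,  1e3,  1e4,  1e5,  1e6,  1e7,
        1e8, 1e9, 1e10, 1e11, 1e12, 1e13, 1e14, 1e15
    };
    uint64_t w = 0;      /* all digits, read as one integer */
    int ndigits = 0;     /* total digits seen */
    int nfrac = 0;       /* digits after the decimal point */
    int seen_point = 0;

    assert(buf != NULL && out != NULL);
    for (size_t i = 0; i < len; i++) {
        char c = buf[i];
        if (c == '.' && !seen_point) {
            seen_point = 1;
            continue;
        }
        if (c < '0' || c > '9')
            return -1;   /* sign, exponent, or junk: not handled here */
        if (++ndigits > 15)
            return -1;   /* w could exceed 2^53 and stop being exact */
        w = w * 10 + (uint64_t)(c - '0');
        if (seen_point)
            nfrac++;
    }
    if (ndigits == 0)
        return -1;

    /* Both operands are exact, so this division is the only rounding. */
    *out = (double)w / pow10[nfrac];
    return 0;
}
```

Anything outside the restricted form (signs, exponents, longer digit strings) is rejected rather than guessed at, and would still need a full algorithm or a library call.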

answered Mar 21 '13 at 23:02

aka.nice
