2

My question is regarding the solution posted by 6502 on [c++ stringstream is too slow, how to speed up? Parsing Float method posted by him is fast but its not giving correct result on large double values.

Input:

132.12345645645645

Output:

132.123

double parseFloat(const std::string& input){
    const char *p = input.c_str();
    if (!*p || *p == '?')
        return NAN_D;
    int s = 1;
    while (*p == ' ') p++;

    if (*p == '-') {
        s = -1; p++;
    }

    double acc = 0;
    while (*p >= '0' && *p <= '9')
        acc = acc * 10 + *p++ - '0';

    if (*p == '.') {
        double k = 0.1;
        p++;
        while (*p >= '0' && *p <= '9') {
            acc += (*p++ - '0') * k;
            k *= 0.1;
        }
    }
    if (*p) die("Invalid numeric format");
    return s * acc;
}

I want to know is there any way to change this function to get correct result.

Community
  • 1
  • 1
sv_jan5
  • 1,543
  • 16
  • 42
  • 2
    How about [`std::stod`](http://en.cppreference.com/w/cpp/string/basic_string/stof)? Or the old [`std::strtod`](http://en.cppreference.com/w/cpp/string/byte/strtof)? At least they are correct. Speed might be good in some circumstances, but there's always a trade-off, in this case it *might* be speed versus correctness. – Some programmer dude Jul 26 '14 at 13:26
  • 3
    Most likely you are just confusing the number that was parsed, and the output that was printed. For some reason only three decimals are printed. – gnasher729 Jul 26 '14 at 13:28
  • but this method is 10x faster compared to strtod – sv_jan5 Jul 26 '14 at 13:28
  • 2
    Also remember that when doing multiple calculations with floating point numbers on a computer, you will get rounding errors and they compound and becomes bigger the more calculations you do. – Some programmer dude Jul 26 '14 at 13:29
  • I commented on the question why you are only getting three decimals. You should be aware that the code you posted would be unacceptable for serious applications due to avoidable rounding errors. Getting bad results quickly is rarely acceptable. – gnasher729 Jul 26 '14 at 13:32
  • 2
    As for the speed increase, have you measured it? And what if the input is not valid? I.e. what would happen if the string is not a number? Or e.g. `"-a"`? Or contains non-digit values at the end? All those cases are handled properly in the standard functions, while this function will fail in possible undefined ways. – Some programmer dude Jul 26 '14 at 13:33
  • Which method will be best for parsing string to float without rounding errors in least possible time? – sv_jan5 Jul 26 '14 at 13:36

2 Answers2

8

If you're printing the result with cout, make sure you set the necessary precision:

cout.precision(20);

live example

However as you noted a simpler method would be:

double string_to_double( const std::string& s )
 {
   std::istringstream i(s);
   double x;
   if (!(i >> x))
     return std::numeric_limits<double>::quiet_NaN(); // Or anything you prefer
   return x;
 }

Notice: due to roundoff errors, exceeding with the number of digits might not yield 100% expected results on the rightmost part.

Marco A.
  • 43,032
  • 26
  • 132
  • 246
  • 1
    It would be better to `throw` in case of error (instead of returning valid value `0`). – Jarod42 Jul 26 '14 at 13:31
  • I agree, or a NaN of some sort (not sure how `NAN_D` is implemented in OP's code) – Marco A. Jul 26 '14 at 13:34
  • Is this method sufficiently fast or are there other better options? – sv_jan5 Jul 26 '14 at 13:44
  • 1
    @user3756606 You can benchmark this code and your version, I don't know which one is faster but surely you should also consider the acceptable error threshold and probably also the error handling capabilities as others have noted. – Marco A. Jul 26 '14 at 13:49
1

The scaling for the digits after the decimal point is significantly worse than useless.

Multiplying by 0.1 is a mistake... not least because 0.1 is not represented exactly in binary floating point. So going round in a loop, repeatedly multiplying by 0.1 is accumulating more and more error.

For big numbers, when multiplying the number so far by 10, and adding the next digit value, you are going to accumulate more errors :-(

The trick is to accumulate the digits as an unsigned long long (assuming that's at least 64 bits), ignoring the . but counting the number of digits after it. When you have all the digits, float the integer and then divide by the required power of 10. Now you have at most one rounding (in the float, if you have more than 53 bits of value) and a second in the division. But, things can get complicated... if you have a number whose value exceeds what you can fit in an unsigned long long, you need to stop adding digits (and round)... if that happens before the ., you need to count digits up to it, and then multiply by a power of 10.

NB: decimal to binary floating point conversion is hard in the general case. If you are converting numbers without exponents, and the number of digits is reasonable, it's not too bad. The problem that bites you is how to create large powers of 10 to multiply/divide by, without introducing (unacceptable) errors.