Error with converting a binary input from the user into a double value

Question

Below is the code I've written to convert a binary string into its double equivalent. However,

binConversionDouble("0111111111111111111111111111111111111111111111111111111111111111")

gives:

179769313486149841153851976955417028335471708986564534469802644816832731470450734933617524554272984142853535521554773891036051644209223511432842829043130247900910146729889177843938143131935935774382721844130004287345894163215696051477671359689698349651554878795894806567601614971014045300870721438977577975808.000000

but the actual value of DBL_MAX is:

179769313486231570814527423731704356798070567525844996598917476803157260780028538760589558632766878171540458953514382464234321326889464182768467546703537516986049910576551282076245490090389328944075868508455133942304583236903222948165808559332123348274797826204144723168738177180919299881250404026184124858368.000000

My logic seems to be correct, but I am unsure whether the difference is caused by a loss of precision in the arithmetic or by a flaw in my logic. Can someone please explain why the two values are different?

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <ctype.h>
#include <assert.h>
#include <limits.h>
#include <math.h>
#include <float.h>

double binConvertDouble(const double *mainBinArr, const char usersBinVal[]) {

    double isNegative = 1;
    if (usersBinVal[0] == '1') {isNegative = -1;}

    double exponent = 0;
    char exponentSubString[12];
    int count = 0;
    for(int i =0; i<11; i++) {
        exponentSubString[i] = usersBinVal[i+1];
        count = count + 1;
    }
    exponentSubString[11] = '\0';
    exponent = calculatingDoublesExp(mainBinArr, exponentSubString);

    double summingMantissa = 0;
    for(int i =12; i<52; i++) {
        if (usersBinVal[i] == '1') {
            summingMantissa = summingMantissa + pow(2, 11-i);
        }
    }
    double totalMantissaVal = 0;
    totalMantissaVal = 1 + summingMantissa;
    double actualExp = pow(2, (exponent-1023));
    double finalVal = isNegative * totalMantissaVal * actualExp;
    return finalVal;
}


int main() {
    printf("VALUE WE GET %f\n", binConversionDouble("0111111111111111111111111111111111111111111111111111111111111111"));
    return 0;
}

What did you do to debug your code? Please [edit] your question to add new information, don't add a comment. — the busybee, Nov 09 '21 at 17:15
Is all of this code really necessary in order to reproduce the problem? If possible, please provide a [mre], with emphasis on the word "minimal". Please read the instructions in the provided link for guidance on how to create such an example. — Andreas Wenzel, Nov 09 '21 at 17:32
Have you tried running your code line by line in a debugger while monitoring the values of all variables, in order to determine at which point your program stops behaving as intended? If you did not try this, then you may want to read this: [What is a debugger and how can it help me diagnose problems?](https://stackoverflow.com/q/25385173/12149471) Even if using a debugger does not actually solve the problem, it should at least help you to isolate the problem and to create a [mre] of the problem, so that it will be easier for other people to help you. — Andreas Wenzel, Nov 09 '21 at 17:35
The input string represents an IEEE-754 NaN value. `DBL_MAX` would be represented by the input string `"0111111111101111111111111111111111111111111111111111111111111111"`. — Ian Abbott, Nov 09 '21 at 17:47
"However, binConversionDouble("0111111111111111111111111111111111111111111111111111111111111111") gives inf not DBL_MAX." --> avii, why do you think this should result in `DBL_MAX`? It shouldn't. `DBL_MAX` has a different binary. — chux - Reinstate Monica, Nov 09 '21 at 17:47
@IanAbbott and @chux - Reinstate Monica I have used what you said and am now getting what I want, but it is accurate only to a few digits. — avii, Nov 09 '21 at 18:20
So the value printed is 179769313486149841153851976955417028335471708986564534469802644816832731470450734933617524554272984142853535521554773891036051644209223511432842829043130247900910146729889177843938143131935935774382721844130004287345894163215696051477671359689698349651554878795894806567601614971014045300870721438977577975808.000000. — avii, Nov 09 '21 at 18:21
But double max is 179769313486231570814527423731704356798070567525844996598917476803157260780028538760589558632766878171540458953514382464234321326889464182768467546703537516986049910576551282076245490090389328944075868508455133942304583236903222948165808559332123348274797826204144723168738177180919299881250404026184124858368.000000. Is it normal for this to happen due to loss of precision somewhere in the arithmetic or is my logic wrong? — avii, Nov 09 '21 at 18:22
@AndreasWenzel I have tried posting the question again, thank you for your feedback. — avii, Nov 09 '21 at 18:27
The value printed has the 12 least significant bits all zero. The printed value is `0x1.ffffffffff000p+1023`. `DBL_MAX` is `0x1.fffffffffffffp+1023`. That is probably because you have `i<52` instead of `i<64` in your `for` loop. — Ian Abbott, Nov 10 '21 at 10:09

Ian Abbott · Accepted Answer · 2021-11-10T10:46:21.683

The example input string "0111111111111111111111111111111111111111111111111111111111111111" corresponds to an IEEE-754 double NaN (Not-a-Number) value. The exponent field of the string is "11111111111" (2047), which corresponds to a (radix-2) exponent of 1024 plus the zero-offset bias of 1023. However, the maximum (radix-2) exponent of a finite IEEE-754 double is 1023. When exponent is 2047 (it still includes the bias of 1023 at that point), the out-of-range exponent causes pow(2, exponent-1023) to return inf.

The IEEE-754 DBL_MAX value is represented by the input string "0111111111101111111111111111111111111111111111111111111111111111".

The code that extracts the significand (mantissa) value from the input string is terminating too early:

    double summingMantissa = 0;
    for(int i =12; i<52; i++) {
        if (usersBinVal[i] == '1') {
            summingMantissa = summingMantissa + pow(2, 11-i);
        }
    }

The terminating condition should be i<64, so that all 52 significand bits (string indices 12 to 63) are read.
