
I am implementing MobileNetV2 in C and comparing it, layer by layer, to a Keras model of the same network to make sure I'm doing things right. I get a very close approximation of the network's inference result, but there is an error around the 5th decimal place. While looking for the reason for the imprecision, I came across something strange.

I am working exclusively with `float` variables in C, and all of the arrays in Python, including all of the weight arrays and other parameters, are float32.

When exporting my processed image from Python to a .csv file, I passed seven decimal places to the export function: `np.savetxt(outfile, twoD_data_slice, fmt='%-1.7e')`. This still produces a float, but with certain limitations; namely, that last decimal place does not have full precision. However, one of the numbers I got was "0.98431373". When I tried to convert it in C, I instead got "0.98431377".

I asked a question here about this result and was told that using seven decimal places was a mistake, but this still doesn't explain why Python can handle a number like "0.98431373" as a float32 while in C it gets changed to "0.98431377".

My guess is that Python is using a different 32-bit float format than the one I'm using in C, as evidenced by how its float32 can handle a number like "0.98431373" while the `float` in C cannot. And I think this is what is causing the imprecision of my implementation compared to the final result in Python: if Python can handle numbers like these, then the precision it has while doing the neural network's calculations is higher than in C, or at least different, so the answer should differ as well.

Is the floating point standard different in Python compared to C? And if so, is there a way I can tell Python to use the same format as the one in C?


Update

I changed the way I import files to use `atof`, like so:

#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

void import_image(float data0[224][224][3]) {

    // open file
    FILE *fptr = fopen("image.csv", "r");
    if (fptr == NULL) {
        perror("fopen()");
        exit(EXIT_FAILURE);
    }

    int c = fgetc(fptr); // int, not char, so EOF stays representable
    char s[15];          // longest token is "-x.xxxxxxxe-xx" = 14 chars + '\0'

    for (int y = 0; y < 224; ++y) {         // lines
        for (int x = 0; x < 224; ++x) {     // columns
            for (int d = 0; d < 3; ++d) {   // depth
                // copy one whitespace-delimited token into s
                int i;
                for (i = 0; c != '\n' && c != ' ' && c != EOF; ++i) {
                    assert( 0 <= i && i < 14 ); // leave room for the terminator
                    s[i] = (char)c;
                    c = fgetc(fptr);
                }
                s[i] = '\0';
                float f = atof(s);      // one integrated string-to-float conversion
                data0[y][x][d] = f;     // save on array
                c = fgetc(fptr);        // skip the delimiter
            }
        }
    }
    fclose(fptr);
}
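
(For reference, a shorter alternative sketch, assuming the same whitespace-separated layout: `fscanf`'s `%f` conversion skips whitespace and parses the scientific notation directly.)

#include <stdio.h>
#include <stdlib.h>

void import_image_fscanf(float data0[224][224][3]) {
    FILE *fptr = fopen("image.csv", "r");
    if (fptr == NULL) {
        perror("fopen()");
        exit(EXIT_FAILURE);
    }
    for (int y = 0; y < 224; ++y)           // lines
        for (int x = 0; x < 224; ++x)       // columns
            for (int d = 0; d < 3; ++d)     // depth
                if (fscanf(fptr, "%f", &data0[y][x][d]) != 1) {
                    fprintf(stderr, "bad token at (%d,%d,%d)\n", y, x, d);
                    exit(EXIT_FAILURE);
                }
    fclose(fptr);
}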

I also exported the images from Python using seven decimal places, and the result seems more accurate. If both use the same float standard, even the partially precise last digit should match. And indeed, there doesn't seem to be any error in the image I import when compared to the one I exported from Python.
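
A quick test with the number from the original question confirms the round trip (a minimal check; the output shown is what I get on an IEEE-754 machine):

#include <stdio.h>
#include <stdlib.h>

int main(void) {
    float f = atof("9.8431373e-01"); // one integrated conversion, as suggested
    printf("%.8g\n", f);             // prints 0.98431373, matching the Python export
    return 0;
}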

There is still, however, an error in the last digits of the final answer of the system. Python displays this answer with eight significant digits, which I mimic with %.8g.

My answer:

'tiger cat', 0.42557633
'tabby, tabby cat', 0.35453162
'Egyptian cat', 0.070309319
'lynx, catamount', 0.0073038512
'remote control, remote', 0.0032443549

Python's answer:

('n02123159', 'tiger_cat', 0.42557606)
('n02123045', 'tabby', 0.35453174)
('n02124075', 'Egyptian_cat', 0.070309244)
('n02127052', 'lynx', 0.007303906)
('n04074963', 'remote_control', 0.0032443653)

The error seems to start appearing after the first convolutional layer, which is where I start performing mathematical operations on these values. There could be an error in my implementation, but assuming there isn't, could this be caused by a difference in the way Python operates on floats compared to C? Is this imprecision expected, or is it likely an error in my code?
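
One likely contributor (a contrived sketch, not code from my network): float addition is not associative, so summing the same products in a different order than Keras can change the last digits even when both sides use IEEE-754 binary32.

#include <stdio.h>

int main(void) {
    float big = 1e8f;              // exactly representable; adjacent floats are 8 apart here
    float a = (big + 1.0f) - big;  // the 1.0f is absorbed by rounding
    float b = (big - big) + 1.0f;  // same operands, different order
    printf("%.8g %.8g\n", a, b);   // prints "0 1"
    return 0;
}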

Ricardo
  • You asked the Python code to output 7 decimal places, but the example you show `0.98431373` has 8. – Weather Vane Oct 19 '21 at 23:57
  • @WeatherVane It outputted 9.8431373e-01, so 7 decimal places but 8 significant digits. – Ricardo Oct 20 '21 at 00:06
  • I know it doesn't answer the question, but if you are fighting with the last digit, you are using the wrong type. Accept that floating point is fundamentally imprecise and use a size at least as good as you need. Better, never use `float` at all. Anyway, isn't a neural network "fuzzy logic"? – Weather Vane Oct 20 '21 at 00:09
  • It does not have to be a different float format; IEEE 754 is configurable, with things like rounding modes and other settings that you can configure in your C code. You would have to check how Python and even numpy deal with these edge cases. – Dr. Snoopy Oct 20 '21 at 00:11
  • @Dr.Snoopy I see. And I'm assuming there isn't a way to change these settings in Python? – Ricardo Oct 20 '21 at 00:18
  • Hang on! In that other question you were asking about the number 9.8431373, which can't be represented as a float in C, being "rounded" to 9.8431377. But now you're asking about 0.98431373, and that *can* be represented pretty accurately as a float in C. – Steve Summit Oct 20 '21 at 00:42
  • @SteveSummit So, the way I import the data to C is I take the mantissa, convert it to float, then take the exponent, convert it to long, and then multiply the mantissa by 10^exponent. So the number that I received was 9.8431373, which got turned into 9.8431377, and then into 0.98431377 after multiplying by 10^exponent. I thought this would be the same case for a number one exponent below, but you're telling me I could actually represent 0.98431373 in C if I changed my code then? – Ricardo Oct 20 '21 at 00:48
  • Why are you multiplying by 10^exponent yourself? Functions like `atof` and `strtod` can handle exponential notation just fine, and they can probably do a better job. – Steve Summit Oct 20 '21 at 01:00
  • Yes, although 9.8431373 by itself gets "rounded" to 9.8431377, I see that "9.8431373e-1" should convert to 0.984313726. – Steve Summit Oct 20 '21 at 01:03
  • @SteveSummit Oh, I didn't know `atof` could do that, I'll give it a try. This may even correct the imprecision in my implementation. – Ricardo Oct 20 '21 at 01:10
  • Edit the question to provide a [mre]. Describing the code as “When trying to convert this to C” but only later revealing in the comment you do not do a single integrated conversion as with `strtod` or similar built-in routine but rather do multiple calculations wastes people’s time. You have paragraphs of useless text in the question when all that may be necessary to illustrate the problem is to simply show the lines of code you use to do the conversion, the input you give them, and the output obtained. – Eric Postpischil Oct 20 '21 at 01:10
  • @EricPostpischil If I knew a single integrated conversion existed, I would have used it. You can blame me for not explaining my question correctly, but not for not knowing something about the language. I'm here to learn. And I just learned something new, so it was not wasted time. Either way, I'll take your criticism and change the question after I have tested the solution and see if the problem still happens. – Ricardo Oct 20 '21 at 01:17
  • @WeatherVane Saying that floating point is "fundamentally imprecise" is a bit of an overstatement. Some aspects of floating point can be perfectly precise, and round-tripping a particular value (that's already a float) should be one of them. – Steve Summit Oct 20 '21 at 03:48
  • @Ricardo To accurately "round trip" a float in every case -- that is, from a binary representation, to a decimal string, and back to hopefully the same binary representation -- I believe you need 9 significant digits. See [this nice explanation](https://stackoverflow.com/questions/61609276/how-to-calculate-float-type-precision-and-does-it-make-sense/61614323#61614323). – Steve Summit Oct 20 '21 at 03:49
  • @SteveSummit I would consider the exact representations as special cases. They don't change the general idea that one should not expect precision, but work in the knowledge that they are typically inexact. – Weather Vane Oct 20 '21 at 06:50

1 Answer


A 32-bit floating point number can encode about 2^32 different values.
0.98431373 is not one of them.
Finite floating point values are of the form: some_integer * power-of-two.

The closest choice to 0.98431373 is 0.98431372_64251708984375, which is 16514044 * 2^-24.

Printing 0.98431372_64251708984375 to 8 fractional decimal places is 0.98431373. That may appear to be the 32-bit float value, but its exact value differs a small amount.
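
A quick check of both claims (a sketch; assumes IEEE-754 binary32 `float` and uses `frexpf()` from `<math.h>`):

#include <math.h>
#include <stdio.h>
#include <stdlib.h>

int main(void) {
    float f = strtof("0.98431373", NULL);
    int exp;
    float sig = frexpf(f, &exp);                          // f == sig * 2^exp, 0.5 <= sig < 1
    printf("%.0f * 2^%d\n", sig * 16777216.0f, exp - 24); // 16514044 * 2^-24
    printf("%.25f\n", f);                                 // 0.9843137264251708984375000
    return 0;
}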


in C that gets changed to "0.98431377"

0.98431377 is not an expected output of a 32-bit float, as the next larger float is 0.98431378_6029815673828125. Certainly OP's conversion to C produces a 64-bit double, with artifacts from the unposted conversion code.

"the way I import the data to C is I take the mantissa, convert it to float, then take then exponent, convert it to long, and then multiply the mantissa by 10^exponent" is too vague. Best to post code than only a description of code.


Is the floating point standard different in Python compared to C?

They could differ, but likely are the same.

And if so, is there a way I can tell Python to use the same format as the one in C?

Not really; more likely the other way around. I am certain C allows more variations on FP than Python.
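
As one example of that variation, `<float.h>` exposes several implementation choices (a small probe; the commented values are typical of IEEE-754 binary32):

#include <float.h>
#include <stdio.h>

int main(void) {
    printf("FLT_EVAL_METHOD = %d\n", FLT_EVAL_METHOD); // 0: float expressions evaluated as float
    printf("FLT_MANT_DIG = %d\n", FLT_MANT_DIG);       // 24 for binary32
    printf("FLT_DIG = %d\n", FLT_DIG);                 // 6 decimal digits always preserved
    return 0;
}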

chux - Reinstate Monica