Does converting a dataset to float32 or float16 change the "true" value of my data?

For example, I have the value 3.6. When I read it with Python as float64 it remains 3.6, but after converting to float32 it becomes 3.5999999046325684, as expected under IEEE 754. Doesn't converting the data directly to another float type change the real database?
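
The behaviour described above can be reproduced with a minimal sketch that uses only the Python standard library. The helper name `round_to` is hypothetical (it is not from the question), and `struct` stands in for whatever conversion the dataset loader actually performs. `Decimal` is used to print the exact binary value that is stored, since `repr` only shows the shortest string that round-trips.

```python
import struct
from decimal import Decimal


def round_to(fmt, x):
    """Round a Python float (IEEE 754 binary64) to the nearest value
    representable in the given struct format, then return it as a float."""
    return struct.unpack(fmt, struct.pack(fmt, x))[0]


x64 = 3.6                    # stored as the nearest binary64 value to 3.6
x32 = round_to("<f", x64)    # "<f" is IEEE 754 binary32 (float32)
x16 = round_to("<e", x64)    # "<e" is IEEE 754 binary16 (float16)

for name, value in [("float64", x64), ("float32", x32), ("float16", x16)]:
    # repr(value) is the shortest decimal string that round-trips;
    # Decimal(value) is the exact value held in memory.
    print(name, repr(value), Decimal(value))
```

Even the float64 value is not exactly 3.6; it only prints as `3.6` because that is the shortest string that maps back to the same bits.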

  • What database? – njzk2 Jun 17 '21 at 18:23
  • It could be the Higgs UCI Database; I'm currently working on it. – Vitor Gonçalves Jun 17 '21 at 18:29
  • What do you mean by the "true" value of the data? When you change the type, the bits must be remapped to a new mantissa and exponent because the number of available bits changes. Precision is always lower when going to fewer bits. – Rob S. Jun 17 '21 at 19:53
  • @RobS. By "true" I mean the original dataset. I'm wondering whether changing the floating-point type might be a valid way to improve the accuracy of machine learning algorithms. – Vitor Gonçalves Jun 17 '21 at 20:21
  • My answer: theoretically, yes, it changes the true value. Practically, probably not. In the example you've given, the converted value is only ~1/10,000,000th off the original value, so in a practical sense, probably no. – Rob S. Jun 17 '21 at 20:30
  • If I had a dataset of 10 million examples and converted the entire dataset to another float type, would that affect the final accuracy even more? – Vitor Gonçalves Jun 17 '21 at 20:37
  • Sorry if this sounds rude, but what do you base that on? – Vitor Gonçalves Jun 17 '21 at 20:38
  • If you convert to a wider type I think you'd be fine. Even if you convert to a narrower type you'd PROBABLY be fine. But without knowing the actual content I cannot tell you for sure, and of course going from a 16-bit float to an 8-bit float would lose a substantial amount of precision. Even your "original" 3.6 is not exactly 3.6; it is as close to 3.6 as the format it is stored in can get, so converting to a wider type (32 to 64) keeps that stored value, which is very close to but slightly off 3.6. – Rob S. Jun 17 '21 at 20:44
  • What does “‘true’ value” mean? Any floating-point datum represents one number exactly. Is that what you are asking about? If you have some `float64` and convert it to a `float32` or `float16`, then, if the original value is representable in the new format, the result of the conversion has the same value as the operand. If the original value is not representable in the new format, the result of the conversion has a different value. In most languages and software, it is rounded to the nearest value representable in the new format. – Eric Postpischil Jun 18 '21 at 23:41
  • How are you “Changing the dataset”? Are you converting the entire dataset and writing it to its original store (the file it is in or other long-term storage)? If so, that changes it. Are you converting it but not writing it to the original store? If so, that does not change the real database. Give some context and explain your words. – Eric Postpischil Jun 18 '21 at 23:42
  • @Eric Postpischil I want to use this float conversion to evaluate how the performance of machine learning algorithms changes, for example whether float16 is superior to float32 or float64. But I'm afraid that with this modification I'm altering the original data. – Vitor Gonçalves Jun 21 '21 at 16:44
  • Yes, you will likely *lose precision* by using a lesser-precision floating-point type. Whether that precision matters to you or your application is something that only you can answer. Likely, there will be no performance improvement with a lesser-precision type. At the machine-language level, on modern processors, a 64-bit floating-point value can be manipulated almost as quickly, if not exactly as quickly, as a 32-bit floating-point value (and most hardware doesn't support half-precision floats at all). The only consideration is memory usage. Prioritize *correctness* first. – Cody Gray - on strike Jun 23 '21 at 06:36
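
The points raised in the comments (representable values convert exactly, others are rounded to the nearest representable value, and an in-memory conversion leaves the source untouched) can be checked with a small sketch. It assumes a handful of made-up sample values rather than the real dataset, and the helper `to_format` is a hypothetical name; only the standard library is used.

```python
import struct


def to_format(values, fmt):
    """Return a NEW list in which each float64 is rounded to the nearest
    value of the given struct format ("<f" = float32, "<e" = float16)."""
    return [struct.unpack(fmt, struct.pack(fmt, v))[0] for v in values]


# Hypothetical sample values, not taken from any real dataset.
original = [0.5, 3.0, 1024.25, 3.6]

as32 = to_format(original, "<f")
as16 = to_format(original, "<e")

for v, a, b in zip(original, as32, as16):
    # A value "survives" a conversion only if it is representable there.
    print(v, "float32 exact:", a == v, "float16 exact:", b == v)

# Largest relative rounding error introduced by each narrower format.
max_rel_err32 = max(abs(a - v) / abs(v) for v, a in zip(original, as32))
max_rel_err16 = max(abs(b - v) / abs(v) for v, b in zip(original, as16))
print("max relative error float32:", max_rel_err32)
print("max relative error float16:", max_rel_err16)
```

The `original` list is never modified here, so the conversion only "changes the database" if the converted values are written back over the source. Whether errors of this size matter for a given model is something to measure rather than assume.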

0 Answers