0

I'm reading values from a text file and converting them to doubles, then, I'll need to print these values on screen, and they should look exactly the same as in the text file.

Currently, I'm using max precision when printing, and I might get results like this:

text value: 8.994279313857e-317 printed value: 8.99427931385706e-317

Each text file may have a different precision, or even different precisions for different values within the same file, and I can't store these values as strings because of memory concerns.

My thought is to store them as doubles, each with an additional unsigned int for the precision. Is there a way to get the precision out of a number's string representation?

nancyheidilee
  • *"I'm reading values from a text file and converting them to doubles, then, I'll need to print these values on screen, and they should look exactly the same as in the text file."* - This necessitates, that you store your numbers as strings. Assuming that your text file stores base-10 numbers, and floating point numbers are stored as base-2 on your system, you cannot represent every source number in your destination encoding. You'll have to keep the strings around. This is all just premature optimization, so scrap it, and store the numbers as strings. It is **that** simple, really. – IInspectable Jul 20 '16 at 18:59
  • You could try boost.multiprecision maybe. But storing them as strings may be the easiest solution. – Jonathan Potter Jul 20 '16 at 19:03
  • The case is that we need to use minimal memory to achieve the requirements, so strings just won't do :S – nancyheidilee Jul 20 '16 at 19:05
  • Just a thought - if memory is your main concern, can you simply process the file in pieces? – D Hydar Jul 20 '16 at 19:05
  • If strings won't do (and I'm sure you **haven't** profiled this), then you cannot care about correctness. You have to pick: A program that properly works, or a program that doesn't use strings. – IInspectable Jul 20 '16 at 19:06
  • Sorry, to be specific, after reading in the text files, we convert the values into doubles and write them as binary files. When the binary files are loaded back into memory, it should be able to provide exact same features as when data was loaded from the text files. We can design the binary file format on our own, but the goal is to keep the file size as small as possible. – nancyheidilee Jul 20 '16 at 19:09
  • You cannot maintain the same information, once you convert strings to floating point values. The conversion is lossy. Either keep the strings or drop the requirement. – IInspectable Jul 20 '16 at 19:10
  • I realise that information is lost once the conversion is done, but is there any way I could get the precision out of the original string? (before information is lost) – nancyheidilee Jul 20 '16 at 19:12
  • It's not even clear what you mean by "get the precision out of the original string". Using the numbers you've shown above, which "part" of them are you hoping to extract? – Jonathan Potter Jul 20 '16 at 19:24
  • 1
    You could count the number of digits in the original number, and then use that with `setprecision()` when you print the number out later. – Barmar Jul 20 '16 at 19:25
  • 1
    I'm not even sure you realize, how `8.994279313857e-317` is well outside the range representable by double precision IEEE 754 floating point numbers. Maybe you need to take a step back, and think this through again. [What Every Computer Scientist Should Know About Floating-Point Arithmetic](https://docs.oracle.com/cd/E19957-01/806-3568/ncg_goldberg.html) is a good read. – IInspectable Jul 20 '16 at 20:02
  • You can of course create a dedicated `numberstring` class that stores 4 bits per digit. You also need the `+-e.`symbols, but that's still only 14 characters altogether. – MSalters Jul 20 '16 at 20:08
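
Barmar's suggestion above (count the digits, then reuse that count when printing) could look roughly like the sketch below. The names `stored_value`, `count_significant_digits`, `parse_value` and `format_value` are made up for illustration. It assumes every value in the file is written in scientific notation with a lower-case `e` and a two-digit-or-longer signed exponent, because only the digit count is recorded, not the layout. It also cannot help once the text has more significant digits than a double can hold (15 are guaranteed for a normal IEEE 754 double, fewer for subnormal values).

```cpp
#include <cctype>
#include <cstdlib>
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical record: the parsed value plus the number of significant
// digits that appeared in the source text.
struct stored_value {
    double value;
    unsigned char digits;
};

// Counts the mantissa digits of a token such as "8.99e-3" or "-0.0120".
// Leading zeros are skipped; trailing zeros are kept because they were
// part of the original text.
unsigned char count_significant_digits(const std::string& token) {
    unsigned count = 0;
    bool seen_nonzero = false;
    for (char c : token) {
        if (c == 'e' || c == 'E') break;  // exponent digits do not count
        if (!std::isdigit(static_cast<unsigned char>(c))) continue;
        if (c != '0') seen_nonzero = true;
        if (seen_nonzero) ++count;
    }
    return static_cast<unsigned char>(count > 0 ? count : 1);
}

// Converts the token to a double and remembers how many digits it had.
stored_value parse_value(const std::string& token) {
    return { std::strtod(token.c_str(), nullptr), count_significant_digits(token) };
}

// Prints in scientific notation with the recorded number of digits.
std::string format_value(const stored_value& v) {
    std::ostringstream out;
    out << std::scientific << std::setprecision(v.digits - 1) << v.value;
    return out.str();
}
```

With this, `format_value(parse_value("1.250e+03"))` keeps the trailing zero that a max-precision print would lose. A file that mixes fixed and scientific notation would need an extra flag per value, which eats into the memory savings.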

2 Answers

1

You can't have your cake and eat it too! :)

I mean, you want the exact representation of a string in a number, without having to pay the cost in terms of memory (by just storing the string*).

In that case, the best you could do is use a `long double` (the type you get by appending `L` to a floating-point constant). Of course, this has its limits too, since our computers have limits. Moreover, it would waste space and processing time to use a `long double` for numbers that do not need it, since you say the precisions you will meet are not fixed. You could do that with the following code, assembled from two other answers and `std::numeric_limits`:

#include <fstream>
#include <vector>
#include <cstdlib>
#include <iostream>
#include <limits>

typedef std::numeric_limits< double > dbl;

int main() {
    std::ifstream ifile("example.txt", std::ios::in);
    std::vector<long double> scores;

    //check to see that the file was opened correctly:
    if (!ifile.is_open()) {
        std::cerr << "There was a problem opening the input file!\n";
        std::exit(1); // exit or do additional error checking
    }

    long double num = 0.0;
    //keep storing values from the text file so long as data exists:
    while (ifile >> num) {
        scores.push_back(num);
    }

    std::cout.precision(dbl::max_digits10); // you can print more digits if you like; you still won't get the exact representation
    //verify that the scores were stored correctly:
    for (std::size_t i = 0; i < scores.size(); ++i) {
        std::cout << std::fixed << scores[i] << std::endl;
    }

    return 0;
}

which gives on my machine:

C02QT2UBFVH6-lm:~ gsamaras$ cat example.txt 
3.12345678912345678912345678 2.79
C02QT2UBFVH6-lm:~ gsamaras$ g++ -Wall main.cpp
C02QT2UBFVH6-lm:~ gsamaras$ ./a.out
3.12345678912345679
2.79000000000000000

If you want to go even further, then use GMP, which is "a free library for arbitrary precision arithmetic". Of course, this won't come for free in terms of memory usage and processing time, so you should really think twice before using it!


*I feel that you are a victim of premature optimization. If I were you, I would just store the strings and see how it goes, or better yet, use a numeric data type, thus losing some precision but making your life far easier. When the project is ready, see what results you are getting and whether the precision achieved pleases "your boss". If not, use strings (or GMP).

gsamaras
  • `long double` is not guaranteed to provide higher precision or a larger range than a `double`. The last version of Visual Studio to support an 80-bit floating point representation was Visual Studio 6 SP6. Today, both floating point types use the same representation (see [Fundamental Types (C++)](https://msdn.microsoft.com/en-us/library/cc953fe1.aspx)) when using Visual Studio. – IInspectable Jul 21 '16 at 09:30
  • @IInspectable sure, in Visual studio! :) – gsamaras Jul 21 '16 at 17:36
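
Whether `long double` buys anything on a given toolchain can be checked directly from `std::numeric_limits`. The sketch below is only illustrative; `precision_report` and `long_double_is_wider` are hypothetical helper names, not part of the answer's code.

```cpp
#include <limits>
#include <sstream>
#include <string>

// Builds a short report of the decimal precision of double and long double
// as implemented by the current compiler and platform.
std::string precision_report() {
    std::ostringstream out;
    out << "double:      digits10=" << std::numeric_limits<double>::digits10
        << " max_digits10=" << std::numeric_limits<double>::max_digits10 << '\n'
        << "long double: digits10=" << std::numeric_limits<long double>::digits10
        << " max_digits10=" << std::numeric_limits<long double>::max_digits10 << '\n';
    return out.str();
}

// True when long double carries more mantissa bits than double here.
constexpr bool long_double_is_wider() {
    return std::numeric_limits<long double>::digits >
           std::numeric_limits<double>::digits;
}
```

If `long_double_is_wider()` is false, as the comment above says is the case with Visual Studio, switching the vector to `long double` changes nothing except possibly the storage size.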
0

The right way is to use a class that represents any numeric value faithfully. To represent such values, you can use an arbitrary-precision integer class together with a scale factor. For example, it can be defined as follows:

#include <boost/multiprecision/cpp_int.hpp>

// numeric = value/pow(10, scale)
struct numeric {
  /// arbitrary precision integer
  boost::multiprecision::cpp_int value;
  /// point position
  int scale;
};
// for example, 10.1 can be represented exactly with the numeric struct, unlike with double:
numeric n{101, 1}; // 101/pow(10, 1)

To print such a number, you can use a helper function that converts objects of the numeric class to std::string:

#include <sstream>
#include <string>

// free function, so it cannot carry a trailing const qualifier
std::string to_string(const numeric& n)
{
  if (n.value.is_zero() && n.scale <= 0)
    return "0";
  bool neg = n.value < 0;
  std::string r;
  {
    std::ostringstream s;
    s << (neg ? -n.value : n.value);
    s.str().swap(r);
  }
  if (n.scale > 0) {
    if (n.scale >= r.length()) {
      r = std::string(n.scale - r.length() + 1, '0') + r;
    }
    r.insert(r.length() - n.scale, 1, '.');
    if (neg)
      r = '-' + r;
  }
  else if (n.scale < 0) {
    if (neg)
      r = '-' + r;
    r += std::string(-n.scale, '0');
  }
  else {
    if (neg)
      r = '-' + r;
  }
  return r;  // returning the local by name lets the compiler elide or move it
}

To construct a numeric object from a std::string, find the position of the decimal point (which gives the scale of the numeric), remove the point, and initialize the cpp_int with the remaining digit string.
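
The steps just described could be sketched as follows. To keep the example free of Boost, `long long` stands in for `cpp_int`, so it overflows on long inputs and is not the answer's actual type. `small_numeric` and `parse_numeric` are hypothetical names. The sketch also folds an optional exponent into the scale, which goes beyond what the paragraph above describes.

```cpp
#include <cctype>
#include <cstddef>
#include <cstdlib>
#include <string>

// Stand-in for the numeric struct: long long replaces cpp_int here.
struct small_numeric {
    long long value;  // all digits with the point removed, sign applied
    int scale;        // number = value / pow(10, scale)
};

// Parses plain decimals ("-1.03") and scientific notation ("101e-1").
// Returns false on anything else; overflow is NOT detected.
bool parse_numeric(const std::string& text, small_numeric& out) {
    std::size_t i = 0;
    bool negative = false;
    if (i < text.size() && (text[i] == '+' || text[i] == '-')) {
        negative = (text[i] == '-');
        ++i;
    }
    long long value = 0;
    int scale = 0;
    bool seen_point = false, seen_digit = false;
    for (; i < text.size(); ++i) {
        const char c = text[i];
        if (std::isdigit(static_cast<unsigned char>(c))) {
            value = value * 10 + (c - '0');
            if (seen_point) ++scale;  // each fractional digit raises the scale
            seen_digit = true;
        } else if (c == '.' && !seen_point) {
            seen_point = true;
        } else {
            break;
        }
    }
    if (!seen_digit) return false;
    if (i < text.size() && (text[i] == 'e' || text[i] == 'E')) {
        const char* begin = text.c_str() + i + 1;
        if (std::isspace(static_cast<unsigned char>(*begin))) return false;
        char* end = nullptr;
        const long exponent = std::strtol(begin, &end, 10);
        if (end == begin) return false;       // 'e' with no digits after it
        scale -= static_cast<int>(exponent);  // e-1 moves the point one place left
        i = static_cast<std::size_t>(end - text.c_str());
    }
    if (i != text.size()) return false;       // trailing garbage
    out.value = negative ? -value : value;
    out.scale = scale;
    return true;
}
```

Trailing zeros survive because they raise the scale (`"2.790"` becomes 2790 with scale 3), but the information that the input used an exponent is lost, so printing it back produces a plain decimal.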

AnatolyS
  • *"To construct the numeric object from std::string you can ..."* - Well, no, it's certainly not **that** simple. You still have to account for +/- sign (and preserve that information), thousand's separators, or scientific notation (e.g. `101e-1`). And with all that additional information stored, the savings become so little that it's hard to justify not simply storing the strings. – IInspectable Jul 20 '16 at 22:09
  • @IInspectable: ok, but it is not so hard to implement with regex. – AnatolyS Jul 21 '16 at 02:51
  • *"it is not so hard"* - I'm going to believe you, once you come up with a regular expression, that works reliably. But this is all missing the point I was trying to make: If you want to store all information necessary to reproduce the original formatting of a floating point literal (in addition to a binary representation), it is likely more efficient to just store the string. – IInspectable Jul 21 '16 at 08:58
  • @IInspectable agreed, but if you need to have not only representation but and full algebra (add, sub, div, mul and so on) the number class instead of double is good point to start. – AnatolyS Jul 21 '16 at 09:05
  • @IInspectable also I suppose that the author talks about values like 10.1243, -1.03 (not scientist notation) which cannot be represented with double exactly, so my answer is based on my assumption. – AnatolyS Jul 21 '16 at 09:15
  • The question **explicitly** uses scientific notation. Regardless of representation, though, base-10 decimals (in general) cannot be accurately represented in a base-2 number system. – IInspectable Jul 21 '16 at 09:34
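
If the conclusion is to keep the text after all, the 4-bits-per-symbol idea from MSalters' comment under the question is one way to shrink it. Below is a minimal sketch under the assumption that tokens only contain the 14 symbols `0123456789+-e.`; `pack_number` and `unpack_number` are hypothetical names, and a real file format would also need a length prefix or terminator per value.

```cpp
#include <cstddef>
#include <string>
#include <vector>

namespace {
const std::string kAlphabet = "0123456789+-e.";  // 14 symbols, codes 0..13
const unsigned char kPad = 15;                   // marks an unused low nibble
}

// Packs two symbols per byte. Returns an empty vector if the text contains
// a character outside the alphabet (for example an upper-case 'E').
std::vector<unsigned char> pack_number(const std::string& text) {
    std::vector<unsigned char> bytes;
    bytes.reserve((text.size() + 1) / 2);
    for (std::size_t i = 0; i < text.size(); i += 2) {
        const std::size_t hi = kAlphabet.find(text[i]);
        const std::size_t lo =
            (i + 1 < text.size()) ? kAlphabet.find(text[i + 1]) : kPad;
        if (hi == std::string::npos || lo == std::string::npos) return {};
        bytes.push_back(static_cast<unsigned char>((hi << 4) | lo));
    }
    return bytes;
}

// Inverse of pack_number; assumes the bytes were produced by it.
std::string unpack_number(const std::vector<unsigned char>& bytes) {
    std::string text;
    for (unsigned char b : bytes) {
        text += kAlphabet[b >> 4];
        const unsigned lo = b & 0x0F;
        if (lo != kPad) text += kAlphabet[lo];
    }
    return text;
}
```

The 19-character token `8.994279313857e-317` from the question packs into 10 bytes and unpacks to exactly the same text, which is close to the size of a double plus a one-byte digit count.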