C++ set precision of a double (not for output)

Question

I am trying to truncate the actual value of a double to a given number of digits of precision (total digits, counting those before and after the decimal point, whether or not there is one), not just format it for output and not just round it. The only built-in functions I found either truncate all decimals or round to a given number of decimal places. Other solutions I found online only work when you know in advance the number of digits before the decimal point, or the entire number. The solution should be dynamic enough to handle any number. I whipped up the code below, which does the trick, but I can't shake the feeling there is a better way to do it. Does anyone know of something more elegant? Maybe a built-in function that I don't know about?

I should mention the reason for this. There are 3 different sources of observed values, and all 3 agree to some level of precision. For example, the values below all agree to 10 digits: 4659.96751751236, 4659.96751721355, 4659.96751764253. However, I can only pull from 1 of the sources, so the best approach is to use only the precision all 3 sources agree on. It's not that I am manipulating numbers and then need to truncate precision; they are observed values. The desired result is 4659.967517.

#include <algorithm>
#include <cstdlib>
#include <iomanip>
#include <sstream>
#include <string>

double truncate(double num, int digits) {
    // check valid digits
    if (digits < 0)
        return num;

    // create string stream for full precision (string conversion rounds at 10)
    std::ostringstream numO;

    // read in number to stream, at 17+ precision things get wonky
    numO << std::setprecision(16) << num;

    // convert to string, for character manipulation
    // (note: a leading '-' or an exponent part is counted like a digit here)
    std::string numS = numO.str();

    // check if we have a decimal; find() returns npos when there is none
    const std::string::size_type decimalIndex = numS.find('.');
    const bool hasDecimal = decimalIndex != std::string::npos;

    // if we have a decimal, erase it for now, logging its position
    if (hasDecimal)
        numS.erase(decimalIndex, 1);

    // make sure our target precision is not higher than current precision
    const std::string::size_type keep =
        std::min(numS.size(), static_cast<std::string::size_type>(digits));

    // replace unwanted precision with zeroes
    numS.replace(keep, numS.size() - keep, numS.size() - keep, '0');

    // if we had a decimal, add it back
    if (hasDecimal)
        numS.insert(decimalIndex, 1, '.');

    return std::strtod(numS.c_str(), nullptr);
}

if you know how to round/truncate the first digit after the decimal point and you know how to multiply/divide by 10 then you can round/truncate any digit you like — 463035818_is_not_an_ai, Sep 26 '17 at 12:10
`(num * pow(10, n)) / pow(10, n)` will do what you want but some numbers can't be expressed exactly so you could still end up with more values. What is the actual purpose for this? Almost sounds like you want a fixed point library. — NathanOliver, Sep 26 '17 at 12:13
Floating point numbers don't really work like that. I would consider using a Decimal type with 2 ints, one being the mantissa and the other the exponent. Manipulating these later to whatever precision you want is very simple: Divide mantissa by pow(10,n), add n to exponent — Eyal K., Sep 26 '17 at 12:16
@tobi303 ya I thought about taking an iterative approach of dividing or multiplying by 10, but I think strings might be quicker. Have to test it XD — , Sep 26 '17 at 12:27
@EyalK. That sounds cool =) My only question is, what would the advantage be over the code I presented? As in order to get the mantissa, I would have to manipulate the double to move the decimal anyways. And then I would have to store two variables that represent one. All in all it sounds like extra lines of code to accomplish the same thing. I may be missing something though.. — , Sep 26 '17 at 23:49
@NathanOliver what is a fixed point library? Might be a good solution! — , Sep 26 '17 at 23:51
@TimJohnsen Take a look at [this](https://stackoverflow.com/questions/79677/whats-the-best-way-to-do-fixed-point-math) — NathanOliver, Sep 27 '17 at 11:31
@TimJohnsen A decimal type would allow you to do accurate mathematical operations on the numbers, something that `double` fails at. `double` also cannot be rounded to arbitrary precision in any meaningful way — Eyal K., Sep 27 '17 at 13:19
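
The multiply-and-divide idea from the comments above can be sketched as follows. This is only a sketch under assumptions: `truncate_sig` is a hypothetical name, the expression from the comment needs an explicit `std::trunc` step before it drops anything, and significant digits are counted from the first nonzero digit (the string version in the question also counts leading zeros). The result is the nearest double to the truncated decimal, so extra digits can still show up when it is printed at full precision.

```cpp
#include <cmath>
#include <cstdlib>

// Hypothetical helper: keep `digits` significant decimal digits of `num`,
// truncating toward zero. Returns `num` unchanged for zero, non-finite
// values, or a non-positive digit count.
double truncate_sig(double num, int digits) {
    if (digits <= 0 || num == 0.0 || !std::isfinite(num))
        return num;

    // Position of the leading digit: 4 for 4659.9, 0 for 0.12, -2 for 0.0012.
    const int magnitude =
        static_cast<int>(std::floor(std::log10(std::fabs(num)))) + 1;

    // Shift so that exactly `digits` digits sit left of the decimal point.
    const int shift = digits - magnitude;
    const double power = std::pow(10.0, std::abs(shift));

    // Use a power >= 1 in both branches: those are exact doubles up to 1e22.
    return shift >= 0 ? std::trunc(num * power) / power
                      : std::trunc(num / power) * power;
}
```

Called as `truncate_sig(4659.96751751236, 10)`, this yields the double closest to 4659.967517. Values that sit within rounding error of a digit boundary may truncate one step differently than expected, which is the same caveat the comments raise about doubles in general.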

score 4 · Answer 1 · answered Sep 26 '17 at 12:14

This will never work since a double is not a decimal type. Truncating what you think are a certain number of decimal digits will merely introduce a new set of joke digits at the end. It could even be pernicious: e.g. 0.125 is an exact double, but neither 0.12 nor 0.13 is.

If you want to work in decimals, then use a decimal type, or a large integral type with a convention that part of it holds a decimal portion.
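
One way to read the "large integral type with a convention" suggestion is a scaled integer. The sketch below assumes the observed values arrive as decimal text and that six decimal places are trusted; `kFracDigits` and `parse_fixed` are hypothetical names, and negative values and overflow are not handled. Note that it fixes the number of decimal places, not the total number of significant digits the question asks about.

```cpp
#include <cstdint>
#include <string>

// Assumption: six digits after the decimal point are trusted.
constexpr int kFracDigits = 6;

// Hypothetical parser: reads a non-negative decimal string such as
// "4659.96751751236" and returns the value in units of 10^-kFracDigits,
// ignoring any further digits. The value never passes through a double.
std::int64_t parse_fixed(const std::string& text) {
    std::int64_t value = 0;
    int frac_seen = -1;  // -1 until the decimal point has been read
    for (char c : text) {
        if (c == '.') {
            frac_seen = 0;
            continue;
        }
        if (c < '0' || c > '9')
            break;  // stop at anything unexpected
        if (frac_seen == kFracDigits)
            break;  // truncate: ignore the remaining digits
        value = value * 10 + (c - '0');
        if (frac_seen >= 0)
            ++frac_seen;
    }
    // Pad with zeros when fewer than kFracDigits fractional digits were given.
    for (int i = (frac_seen < 0 ? 0 : frac_seen); i < kFracDigits; ++i)
        value *= 10;
    return value;
}
```

With this convention, the three example readings from the question all parse to the same integer, 4659967517, and integer comparison and addition on them are exact.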

Bathsheba

That's why I am truncating at a certain point. I know all the numbers are accurate up to a given precision. After that the numbers are random. So I want to zero them out, and work with the proper Sig Figs - up to the known precision. – Sep 26 '17 at 12:20
"After that the numbers are random." every time I read that I die a little inside. Use an integral or decimal type; pretty please, with sugar on top. – Bathsheba Sep 26 '17 at 12:20
I have looked a little into decimal types, but the problem is they only look at precision after the decimal place. I need a total precision of all digits before and after decimal place. The number of digits before the decimal place also varies so I can not hard code it. – Sep 26 '17 at 12:25
A quid says that a `long long` is adequate? – Bathsheba Sep 26 '17 at 12:25
I see what you mean with .125 being better than .12 and .13. But that is only if you know .125 is accurate. If you do not know the 5 is accurate, it is better to truncate and use .12 – Sep 26 '17 at 13:08
@TimJohnsen Your truncating only works if you do it in binary. `1.2` cannot be represented in binary float or even double. It is as impossible as correctly representing `2/3` with a finite number of decimal places. `1.25` can be represented in binary float (1+0*1/2+1*1/4). That is like representing `3/5` with decimal places: `0.6==0+6/10`. – Yunnosch Sep 26 '17 at 13:12
@Yunnosch These are observed decimal values from irrational numbers, not fractions. There is a loss in accuracy with increasing precision. Think about an irrational number like pi, where you only know the proper digits up to, say, 10 digits of precision. – Sep 26 '17 at 23:45
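
The representability point made in this thread can be checked directly by printing more digits than the literal has. A small sketch, where `stored_value` is a hypothetical helper name and the default "C" locale is assumed:

```cpp
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical helper: print a double with 20 significant digits, enough to
// expose any difference between a decimal literal and the stored binary value.
std::string stored_value(double x) {
    std::ostringstream out;
    out << std::setprecision(20) << x;
    return out.str();
}
```

`stored_value(0.125)` and `stored_value(1.25)` come back as written, because both are sums of powers of two. `stored_value(0.12)` and `stored_value(1.2)` do not come back as `0.12` and `1.2`; they show the nearest binary value instead, which is why zeroing decimal digits of a double does not leave clean zeros behind.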

Patricia Shanahan · Answer 2 · 2017-09-26T21:23:00.917

0

I disagree with "So the best approach, is to only use up to the precision all 3 sources agree on."

If these are different measurements of a physical quantity, or represent rounding error due to different ways of calculating from measurements, you will get a better estimate of the true value by taking their mean than by forcing the digits they disagree about to any arbitrary value, including zero.

The ultimate justification for taking the mean is the Central Limit Theorem, which suggests treating your measurements as a sample from a normal distribution. If so, the sample mean is the best available estimate of the population mean. Your truncation process will tend to underestimate the actual value.

It is generally better to keep every scrap of information you have through the calculations, and then remember you have limited precision when outputting results.

As well as giving a better estimate, taking the mean of three numbers is an extremely simple calculation.
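
A minimal sketch of that approach, assuming the three readings are already available as doubles: compute with the full-precision mean, and limit digits only when formatting. `mean3` and `show_sig` are hypothetical names, and `std::setprecision` rounds rather than truncates.

```cpp
#include <iomanip>
#include <sstream>
#include <string>

// Mean of three observations, kept at full double precision for calculations.
double mean3(double a, double b, double c) {
    return (a + b + c) / 3.0;
}

// Hypothetical display helper: format `x` with `digits` significant digits.
// Only the text is limited; the double itself is left untouched.
std::string show_sig(double x, int digits) {
    std::ostringstream out;
    out << std::setprecision(digits) << x;
    return out.str();
}
```

For the three values in the question, `show_sig(mean3(4659.96751751236, 4659.96751721355, 4659.96751764253), 10)` gives `4659.967517`, while the value used in further arithmetic keeps all of its digits.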

edited Sep 26 '17 at 21:23

answered Sep 26 '17 at 14:15

Patricia Shanahan

Yes, I agree the average is the best. However, I am required to use only one of the sources, as I stated in the question. This is to keep the program and resources as lightweight as possible. Since they do not agree after a certain precision, the best route is to use significant figures: take the best precision, and truncate to the accurate significant figures. – Sep 26 '17 at 23:41
Even a random choice of the three sources would be a better estimator than forcing bits, or worse decimal digits, to zero. – Patricia Shanahan Sep 26 '17 at 23:45
I see what you are saying, however I am using Significant Figures. Meaning that I am stating to the reader I am not sure about the accuracy of these digits after a certain point. So the final result, will be displayed only showing the digits I am confident are correct. I used the example of Pi in another comment, if you only know the precision up to a certain point, you use that and truncate digits after that. It isn't replacing the numbers with 0, it is refusing to state the numbers as they can be erroneous. – Sep 27 '17 at 00:06
For example, take the number 4659.96751751236. The accuracy after 10 digits is iffy at best. Say the math operation is simply multiply by 5. Taking the min and max of the digits after that: 4659.96751799999 * 5 = 23299.83758999995 and 4659.96751700000 * 5 = 23299.83758500000. So the two outputs also agree up to 10 digit precision. Thus truncating at 10 precision for the math operations and results yields the most accurate results. – Sep 27 '17 at 00:39
There is a difference between what you display, and what you use in calculations. During calculations, you should use the best estimator of the real value that you can get your hands on. Mean of three observations would be better than a single observation, but a single observation is a better estimator than anything you can get by modifying it. When you come to display, "10" becomes a number that matters and it makes sense to truncate the output to the digits you trust, taking into account both original measurement error and accumulated rounding error. – Patricia Shanahan Sep 27 '17 at 00:47
Right, however if you use the example that I posted, you can obtain the most accurate precision size. Of course this changes from one math operation to the next - along with how min(s) and max(s) are calculated. The example shows that it does not matter what the numbers are after a certain point, when using a given known precision. So truncation works here. Which gets me back to my original question of how to do it best. – Sep 27 '17 at 00:53
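
The min/max arithmetic in the multiply-by-5 comment above amounts to carrying a range through the calculation instead of a single truncated value. A minimal sketch, assuming a hypothetical `Interval` type and `scale` helper (neither comes from the thread), and ignoring floating-point rounding of the bounds themselves:

```cpp
#include <algorithm>

// Hypothetical type: a closed range [lo, hi] assumed to contain the true value.
struct Interval {
    double lo;
    double hi;
};

// Multiply a range by a constant; a negative factor swaps the bounds,
// so min/max is used to keep lo <= hi.
Interval scale(Interval x, double k) {
    const double a = x.lo * k;
    const double b = x.hi * k;
    return Interval{std::min(a, b), std::max(a, b)};
}
```

Scaling `Interval{4659.96751700000, 4659.96751799999}` by 5 gives bounds that both lie between 23299.83758 and 23299.83759, matching the figures in the comment, without modifying the observed value that feeds the calculation.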
