StreamTokenizer input number and parsed number are different

Question

I have a StreamTokenizer that accepts numbers. However, the parsed number is not the same as the input.

Sample code:

String str = "1000000000000.0000000000000";
double initial = 1000000000000.0000000000000;
InputStream in = new ByteArrayInputStream(str.getBytes());
StreamTokenizer input = new StreamTokenizer(new BufferedReader(new InputStreamReader(in)));
input.parseNumbers();
int n = input.nextToken();
if (n == StreamTokenizer.TT_NUMBER) {
    System.out.println("Original: " + str);
    System.out.println("Parsed: " + input.nval);
    System.out.println(initial + " == " + input.nval + " -> " + (initial == input.nval));
}

Output:

Original: 1000000000000.0000000000000
Parsed: 9.999999999999999E11
1.0E12 == 9.999999999999999E11 -> false

How can this be prevented so the two double values are equal?

EDIT: The linked question discusses why this issue appears. I am asking what the possible ways to avoid it are.

A cursory examination of `StreamTokenizer::nextToken` shows that it attempts to accumulate the value of all the digits sans decimal point in a `double` and then divide by a power of ten based on the number of digits to the right of the decimal point to get the correct value. Thus, "1000000000000.0000000000000" is calculated as `10000000000000000000000000. / 10000000000000`. That results in too many bits in the significand and precision is lost. The only solution I can see would be to avoid `StreamTokenizer`. — David Conrad, Aug 15 '18 at 03:29
@ElliottFrisch This question is not a duplicate of that one. — David Conrad, Aug 15 '18 at 03:30
[Appendix D: What Every Computer Scientist Should Know About Floating-Point Arithmetic](https://docs.oracle.com/cd/E19957-01/806-3568/ncg_goldberg.html) — Elliott Frisch, Aug 15 '18 at 03:35
@ElliottFrisch That's all very good and well but in this case the only value being represented in floating point is an integer. The problem is different in this case. — David Conrad, Aug 15 '18 at 03:39
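The accumulate-then-divide approach described in David Conrad's comment can be sketched roughly as follows. This is a hypothetical re-creation written for illustration, not the JDK source; the class and method names are made up. It accumulates every digit into a `double`, counts the digits after the decimal point, and divides once at the end, then prints the result next to what `Double.parseDouble` returns for the same string.

```java
public class NaiveDecimalParse {

    // Hypothetical helper (not the JDK implementation): accumulate all digits
    // in a double, ignoring the decimal point, then divide by 10^fractionDigits.
    static double accumulateThenDivide(String s) {
        double v = 0;
        int fractionDigits = 0;
        boolean seenDot = false;
        for (char c : s.toCharArray()) {
            if (c == '.' && !seenDot) {
                seenDot = true;
            } else if (c >= '0' && c <= '9') {
                v = v * 10 + (c - '0');   // each step may round once v needs more than 53 bits
                if (seenDot) {
                    fractionDigits++;
                }
            } else {
                break;                    // stop at the first non-numeric character
            }
        }
        double denom = 1;
        for (int i = 0; i < fractionDigits; i++) {
            denom *= 10;
        }
        return v / denom;                 // one division of an already-rounded value
    }

    public static void main(String[] args) {
        String str = "1000000000000.0000000000000";
        System.out.println("accumulate/divide:  " + accumulateThenDivide(str));
        System.out.println("Double.parseDouble: " + Double.parseDouble(str));
    }
}
```

For short inputs such as "2.5" the two approaches agree; the difference only shows up once the digit string is too long for the intermediate value to fit in the 53-bit significand.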

Stephen C · Answer 1 · 2018-08-15T07:03:21.020

As per David Conrad's analysis, the StreamTokenizer class's handling of numbers with decimal points is flawed, and there doesn't appear to be a workaround.

But this is only one of many shortcomings of this class. You would be better off using Scanner or String.split. If the input is really complicated, consider using a parser generator to generate a lexer / parser that precisely implements your input syntax.
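A minimal sketch of the two alternatives mentioned above, assuming the input is whitespace-separated and uses `.` as the decimal separator. The class and method names are hypothetical. The locale is pinned to `Locale.ROOT` so that `Scanner` does not expect a locale-specific decimal separator.

```java
import java.util.Locale;
import java.util.Scanner;

public class ScannerParse {

    // Reads the first number from the text with Scanner.
    static double parseWithScanner(String text) {
        try (Scanner sc = new Scanner(text)) {
            sc.useLocale(Locale.ROOT);    // '.' as the decimal separator
            return sc.nextDouble();
        }
    }

    // Splits on whitespace and converts each field with Double.parseDouble.
    static double[] parseWithSplit(String line) {
        String[] parts = line.trim().split("\\s+");
        double[] values = new double[parts.length];
        for (int i = 0; i < parts.length; i++) {
            values[i] = Double.parseDouble(parts[i]);
        }
        return values;
    }

    public static void main(String[] args) {
        String str = "1000000000000.0000000000000";
        double initial = 1000000000000.0000000000000;
        System.out.println(initial == parseWithScanner(str));
        System.out.println(initial == parseWithSplit(str)[0]);
    }
}
```

Both paths hand the token text to the standard decimal-to-double conversion, which returns the nearest `double` to the written value.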

Related bugs:

Having said that, applications that use floating point numbers should be tolerant of issues caused by rounding errors and imprecision. There are a number of Q&As on the best way to compare floating point numbers; e.g.

Manipulating and comparing floating points in java
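One common way to make a comparison tolerant of rounding is a relative-tolerance check. The sketch below is only an illustration; the helper name `nearlyEqual` and the tolerance `1e-9` are assumptions, and the right tolerance depends on the application.

```java
public class ApproxEquals {

    // Hypothetical helper: true when a and b differ by at most relTol
    // times the larger of their magnitudes.
    static boolean nearlyEqual(double a, double b, double relTol) {
        if (a == b) {
            return true;                  // also covers two zeros
        }
        double scale = Math.max(Math.abs(a), Math.abs(b));
        return Math.abs(a - b) <= relTol * scale;
    }

    public static void main(String[] args) {
        double initial = 1.0E12;
        double parsed = 9.999999999999999E11;   // value reported in the question
        System.out.println(initial == parsed);
        System.out.println(nearlyEqual(initial, parsed, 1e-9));
    }
}
```

A relative tolerance scales with the operands, which matters here because adjacent doubles near 1.0E12 are already about 1.2E-4 apart.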

score 0 · Answer 2 · edited Aug 15 '18 at 05:27

BigDecimal represents decimal numbers exactly. A double has limited precision, and combining doubles of very different magnitudes (e.g. double1 = 10000.0 and double2 = 0.0001) can cause low-order digits of the smaller value to be rounded away. BigDecimal avoids that.
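As an illustration of that rounding effect, here is a small sketch with magnitudes chosen for this example (they are an assumption, not taken from the answer) far enough apart that the smaller addend disappears entirely in a `double` while `BigDecimal` keeps it. The class and method names are hypothetical.

```java
import java.math.BigDecimal;

public class MagnitudeDemo {

    // Adjacent doubles near 1e16 are 2 apart, so adding 1.0 rounds back to 1e16.
    static boolean doubleAbsorbsSmallAddend() {
        double big = 1.0e16;
        return big + 1.0 == big;
    }

    // BigDecimal keeps every digit. Arithmetic goes through methods such as
    // add(), since the + operator is not available for BigDecimal.
    static BigDecimal exactSum() {
        return new BigDecimal("1E16").add(BigDecimal.ONE);
    }

    public static void main(String[] args) {
        System.out.println(doubleAbsorbsSmallAddend());     // true
        System.out.println(exactSum().toPlainString());     // 10000000000000001
    }
}
```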

The disadvantages of BigDecimal:

it's slower
operators +, -, *, and / are not overloaded

But if you are dealing with money, or exact precision is a must, you should use BigDecimal; otherwise you will lose precision. Example:

String str = "1.0E12";
double initial = 1000000000000.0000000000000;
BigDecimal exVal = new BigDecimal(str);
System.out.println("Original: " + str);
System.out.println("Parsed: " + exVal);
System.out.println(initial + " == " + exVal + " -> " + (initial == exVal.doubleValue()));

Program output:

Original: 1.0E12
Parsed: 1.0E+12
1.0E12 == 1.0E+12 -> true
