
Some APIs, like the PayPal API, use a string type in JSON to represent a decimal number. So "7.47" instead of 7.47.

Why/when would this be a good idea over using the JSON number value type? AFAIK the number value type allows for infinite precision as well as scientific notation.

kag0
  • because using floats for currency will just cause errors down the road. floats are NOT usable for representing real world values like money - not reliably, anyways. e.g. 7.47 may actually be 7.4699999923423423423 when converted to float. a simple system that simply truncates the extra digits off will result in 7.46 and now you've lost a penny somewhere... shades of Superman II(I?). – Marc B Feb 29 '16 at 21:06
  • @MarcB I'm familiar with why you wouldn't use a float for currency, but is the JSON number actually a float? As I understand it's a language independent number, and you could parse a JSON number straight into a java `BigDecimal` or other arbitrary precision format in any language if so inclined. – kag0 Feb 29 '16 at 21:28
  • depends on what it was in PayPal's system to begin with. JSON is a 1:1 mapping between a monolithic text string and a JS data structure. if a "number" is stored as a `"..."` string in the JSON string, then it was a string in the original data structure, or something that maps to string. – Marc B Feb 29 '16 at 21:44
  • @MarcB so you're saying the reason is based on existing systems, but there's no technical reason for that behavior in general? – kag0 Feb 29 '16 at 22:17

6 Answers


The main reason to transfer numeric values in JSON as strings is to eliminate any loss of precision or ambiguity in transfer.

It's true that the JSON spec does not specify a precision for numeric values. This does not mean that JSON numbers have infinite precision. It means that numeric precision is not specified, which means JSON implementations are free to choose whatever numeric precision is convenient to their implementation or goals. It is this variability that can be a pain if your application has specific precision requirements.

Loss of precision generally isn't apparent in the JSON encoding of the numeric value (1.7 is nice and succinct) but manifests in the JSON parsing and intermediate representations on the receiving end. A JSON parsing function would quite reasonably parse 1.7 into an IEEE double precision floating point number. However, finite length / finite precision decimal representations will always run into numbers whose decimal expansions cannot be represented as a finite sequence of digits:

  1. Irrational numbers (like pi and e)

  2. 1.7 has a finite representation in base 10 notation, but in binary (base 2) notation, 1.7 cannot be encoded exactly. Even with a near infinite number of binary digits, you'll only get closer to 1.7, but you'll never get to 1.7 exactly.

So, parsing 1.7 into an in-memory floating point number, then printing out the number will likely return something like 1.69 - not 1.7.

Consumers of the JSON 1.7 value could use more sophisticated techniques to parse and retain the value in memory, such as using a fixed-point data type or a "string int" data type with arbitrary precision, but this will not entirely eliminate the specter of loss of precision in conversion for some numbers. And the reality is, very few JSON parsers bother with such extreme measures, as the benefits for most situations are low and the memory and CPU costs are high.
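
A few parsers do expose such hooks. As one concrete illustration (Python's stdlib `json` module, used here as an example of a parser that bothers, not as a claim about parsers in general), the caller can route numeric literals into an arbitrary-precision type instead of a double:

```python
import json
from decimal import Decimal

doc = '{"price": 0.123456789012345678901234567890}'

# Default behavior: the literal becomes an IEEE 754 double,
# so digits beyond ~15-17 significant figures are lost.
as_float = json.loads(doc)["price"]

# With a parse hook, the textual digits are preserved exactly.
as_decimal = json.loads(doc, parse_float=Decimal)["price"]

print(as_float)    # a double approximation of the literal
print(as_decimal)  # 0.123456789012345678901234567890
```

The hook works because the parser hands the raw digit string to the conversion function before any binary representation is chosen.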

So if you are wanting to send a precise numeric value to a consumer and you don't want automatic conversion of the value into the typical internal numeric representation, your best bet is to ship the numeric value out as a string and tell the consumer exactly how that string should be processed if and when numeric operations need to be performed on it.

For example: In some JSON producers (JRuby, for one), BigInteger values automatically output to JSON as strings, largely because the range and precision of BigInteger is so much larger than the IEEE double precision float. Reducing the BigInteger value to double in order to output as a JSON numeric will often lose significant digits.
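
The same digit loss is easy to demonstrate in Python (used here purely to illustrate IEEE double behavior, not JRuby itself):

```python
big = 12345678901234567890  # 20 significant digits; a double keeps only ~15-17

# Squeezing the integer through a double mangles the low-order digits.
as_double = float(big)
assert int(as_double) != big

# Shipped as a JSON string, every digit survives the round trip.
assert int("12345678901234567890") == big
```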

Also, the JSON spec (http://www.json.org/) explicitly states that NaNs and Infinities (INFs) are invalid for JSON numeric values. If you need to express these fringe elements, you cannot use JSON number. You have to use a string or object structure.
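
Serializer behavior at these fringes varies. As one illustration (Python's stdlib `json`, which is typical but not universal): by default it emits non-standard literals that a strictly spec-compliant parser will reject, and in strict mode it refuses outright:

```python
import json

# Python extends JSON with non-standard NaN/Infinity literals by default.
s = json.dumps({"x": float("nan"), "y": float("inf")})
print(s)  # {"x": NaN, "y": Infinity}  <- not valid per the JSON spec

# With allow_nan=False the serializer enforces the spec and refuses:
try:
    json.dumps({"x": float("nan")}, allow_nan=False)
except ValueError:
    print("NaN rejected under strict JSON rules")
```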

Finally, there is another aspect which can lead to choosing to send numeric data as strings: control of display formatting. Leading zeros and trailing zeros are insignificant to the numeric value. If you send JSON number value 2.10, after conversion to internal numeric form it will be displayed as 2.1 (and note that 004 isn't even a legal JSON number, since the grammar forbids leading zeros).

If you are sending data that will be directly displayed to the user, you probably want your money figures to line up nicely on the screen, decimal aligned. One way to do that is to make the client responsible for formatting the data for display. Another way to do it is to have the server format the data for display. Simpler for the client to display stuff on screen perhaps, but this can make extracting the numeric value from the string difficult if the client also needs to make computations on the values.

dthorpe
  • Do not underestimate the significance of irrational numbers - there are a lot more irrational numbers than rational numbers! Both are infinite sets, but the measure of the irrationals is greater than the measure of the rationals. Ask your local math prof. Bring coffee. :> – dthorpe Jul 13 '16 at 17:16
  • Can you give some more examples of clients that lose precision on encoding? Jackson in Java for example will happily convert a json decimal into a BigDecimal without losing any precision. It's frustrating to make one's API less accurate (it's not a string, it's a number, which is something json supports) solely because some unnamed clients are doing what I would consider the "wrong thing" when given a json number. – Michael Haefele Sep 07 '17 at 13:22
  • @MichaelHaefele Example of clients that may lose precision with JSON numeric values: Every JavaScript execution environment. Every web browser. Jackson in Java sounds like a vast improvement, but for web app developers the bugaboo is JavaScript running in the web browser that processes those JSON values. – dthorpe May 10 '18 at 19:40
  • Can you please elaborate on how 1.7 can be 1.69? 1.7 might internally be something like 1.699999999999999, but then it'll be printed as 1.7, right? Any code example would be great. – Kazuki Jan 19 '21 at 20:06

I'll be a bit contrarian and say that 7.47 is perfectly safe in JSON, even for financial amounts, and that "7.47" isn't any safer.


First, let me address some misconceptions from this thread:

So, parsing 1.7 into an in-memory floating point number, then printing out the number will likely return something like 1.69 - not 1.7.

That is not true, especially in the context of IEEE 754 double precision format that was mentioned in that answer. 1.7 converts into an exact double 1.6999999999999999555910790149937383830547332763671875 and when that value is "printed" for display, it will always be 1.7, and never 1.69, 1.699999999999 or 1.70000000001. It is 1.7 "exactly".

Learn more here.
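
Both halves of that claim can be checked in a few lines of Python, whose floats are IEEE 754 doubles and whose `repr` uses shortest-round-trip formatting:

```python
from decimal import Decimal

x = 1.7

# The double stored in memory is not exactly 1.7:
print(Decimal(x))  # 1.6999999999999999555910790149937383830547332763671875

# But shortest-round-trip formatting always prints it back as "1.7",
# because 1.7 is the shortest decimal that maps to this exact double.
assert repr(x) == "1.7"
assert float(repr(x)) == x
```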

7.47 may actually be 7.4699999923423423423 when converted to float

7.47 already is a float, with an exact double value 7.46999999999999975131004248396493494510650634765625. It will not be "converted" to any other float.

a simple system that simply truncates the extra digits off will result in 7.46 and now you've lost a penny somewhere

IEEE rounds, not truncates. And it would not convert to any other number than 7.47 in the first place.

is the JSON number actually a float? As I understand it's a language independent number, and you could parse a JSON number straight into a java BigDecimal or other arbitrary precision format in any language if so inclined.

It is recommended that JSON numbers are interpreted as doubles (IEEE 754 double-precision format). I haven't seen a parser that wouldn't be doing that.

And no, BigDecimal(7.47) is not the right way to do it – it will actually create a BigDecimal representing the exact double of 7.47, which is 7.46999999999999975131004248396493494510650634765625. To get the expected behavior, BigDecimal("7.47") should be used.
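
Python's `decimal.Decimal` has exactly the same constructor trap as Java's `BigDecimal`, so the difference is easy to verify (Python standing in for the Java code here):

```python
from decimal import Decimal

# Constructing from the float bakes in the double's exact binary value...
from_float = Decimal(7.47)
print(from_float)   # 7.46999999999999975131004248396493494510650634765625

# ...while constructing from the string keeps the intended decimal.
from_string = Decimal("7.47")
print(from_string)  # 7.47

assert from_float != from_string
```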


Overall, I don't see any fundamental issue with {"price": 7.47}. It will be converted into a double on virtually all platforms, and the semantics of IEEE 754 guarantee that it will be "printed" as 7.47 exactly and always.

Of course floating point rounding errors can happen on further calculations with that value, see e.g. 0.1 + 0.2 == 0.30000000000000004, but I don't see how strings in JSON make this better. If "7.47" arrives as a string and should be part of some calculation, it will need to be converted to some numeric data type anyway, probably float :).
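
A quick Python check of both points, the exact round trip and the arithmetic caveat:

```python
import json

# The literal itself survives a parse/serialize round trip exactly:
assert json.loads("7.47") == 7.47
assert json.dumps(json.loads("7.47")) == "7.47"

# Rounding error appears only once you compute with the values:
print(0.1 + 0.2)  # 0.30000000000000004
assert 0.1 + 0.2 != 0.3

# And a string payload has to be converted before any arithmetic anyway:
assert float("7.47") == 7.47
```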

It's worth noting that strings have disadvantages too: they cannot be passed to Intl.NumberFormat, for example, and they are not a "pure" data type - the dot becomes a formatting decision rather than part of the value.

I'm not strongly against strings - they seem fine to me as well - but I don't see anything wrong with {"price": 7.47} either.

Borek Bernard
  • This is incorrect. Double != Decimal; those are two different types, and if you do conversions from double to decimal you will have a hard time. You've correctly mentioned JSON uses doubles; that's why we use strings to pass decimals. – Kamil Dziedzic Mar 19 '21 at 12:48
  • Nice discussion point, appreciated. I followed your link "It is recommended that JSON numbers are interpreted as doubles...", and I think I have to take issue with your interpretation. To my mind it says little more than IEEE 754 will work when it works but it will have issues when it doesn't. – Ian Apr 29 '21 at 09:12
  • @Ian JSON as a whole is a string so it's up to the parser / language to convert it to some local data type. You're right that there's some inherent ambiguity, for example, in MyStupidLanguage that only supports integers, the `JSON.parse` function would parse `7.47` as `7`, but I haven't seen a language in recent years that wouldn't use IEEE 754 for floating point numbers. – Borek Bernard Apr 29 '21 at 14:23
  • @KamilDziedzic JSON doesn't use doubles – it doesn't dictate how a number should be interpreted by the language (though it recommends IEEE 754 and "all" languages do that) so I'm not sure what is incorrect. – Borek Bernard Apr 29 '21 at 14:31
  • The problem is that IEEE 754 is just not good enough for financial applications. Which falls under the "will have issues when it doesn't" exclusion. – Ian Apr 29 '21 at 17:20
  • Another issue when dealing with measurements is that you may have a value and an uncertainty, and you need to preserve the original measurement exactly. So if you read 7.47 and 0.01 out of the database, how do you know if the intended interpretation was 7.47 +/- 0.01 or 7.470 +/- 0.010 ? In this case, it is safer to store as strings and avoid confusion. (Usually you avoid trailing zeros on an uncertainty, but not always.) A better example may be a number like "1000.0" What is the last significant digit? Do you know the value is 1000 +/- 100? 1000.0 +/- 0.1? – Moondoggy Jun 25 '21 at 19:42
  • I guess you are assuming the numbers won't get very big? Note that 1000000000000000.47, when interpreted as an IEEE754 double precision number, is exactly 1000000000000000.5. Some might be uncomfortable with that loss of 3 pennies. If you want to argue that it's all ok for "realistic dollar amounts", then you'd have to make a defensible decision about where to draw the line, and even if you manage to make a correct argument, it will be unnecessarily complicated. Paypal's decision to express amounts exactly (as strings) is simpler and easier to reason about, and therefore, yes, safer. – Don Hatch Sep 18 '21 at 21:38
  • Sorry, but I'll disagree. Strings are definitely "safer" because then it's up to the application developer to do the conversion safely. If not, you're at the mercy of however the JSON parser stores the value in memory. The implementers are not "wrong" to decide this; they need to accommodate the most performant use if that number needs calculations done to it in the language it is in. The main problem comes from the conversion of a base-10 float (human storage) to a base-2 float (machine storage). Try out -> https://www.h-schmidt.net/FloatConverter/IEEE754.html which shows conversion errors too – Rahly Apr 21 '22 at 21:57
  • @DonHatch, thanks for the example numbers. I am working on a penetration test and noticed the use of JSON numbers in the API (and that they support very large numbers). To check the precision I plugged in 1000000000000000.47 for a financial transaction and the application indeed treated it equivalent to 1000000000000000.5. Like you said, perhaps they won't care, but my guess is they will. – freb Oct 27 '22 at 17:14

The reason I'm doing it is that the SoftwareAG parser tries to "guess" the java type from the value it receives.

So when it receives

"jackpot":{
 "growth":200,
 "percentage":66.67
}

The first value (growth) will become a java.lang.Long and the second (percentage) will become a java.lang.Double

Now, when the second object in this jackpot array has this:

"jackpot":{
 "growth":50.50,
 "percentage":65
}

I have a problem.

When I exchange these values as Strings, I have complete control and can cast/convert the values to whatever I want.
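
The same type-guessing can be reproduced with most JSON libraries; here it is with Python's stdlib `json` (standing in for the SoftwareAG parser):

```python
import json

first = json.loads('{"growth": 200, "percentage": 66.67}')
second = json.loads('{"growth": 50.50, "percentage": 65}')

# The inferred type follows each literal, not the field:
print(type(first["growth"]), type(first["percentage"]))    # <class 'int'> <class 'float'>
print(type(second["growth"]), type(second["percentage"]))  # <class 'float'> <class 'int'>

# With strings, the consumer picks one target type, uniformly:
assert float("50.50") == 50.5 and float("65") == 65.0
```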

Niek
  • Interesting. It's essentially the same issue as the selected answer (json consumer has poor control over parsing/marshalling), but manifesting in a different way. – kag0 Jul 13 '20 at 20:45
  • Why not just use `((Number)value).doubleValue()`... ? – Hans Brende Mar 25 '22 at 16:53
  • As stated, I have no control over the parser. I could "reset" everything by toString() -> toDouble(), but then it's just as easy to start with strings anyway – Niek Mar 30 '22 at 13:26

Summarized Version

Just quoting from @dthorpe's answer, as I think this is the most important point:

Also, the JSON spec (http://www.json.org/) explicitly states that NaNs and Infinities (INFs) are invalid for JSON numeric values. If you need to express these fringe elements, you cannot use JSON number. You have to use a string or object structure.

Ole
  • I'd say this is the most important reason from a technical perspective. But from a practical perspective it sounds more like enough languages or libraries just lack the control to keep precision when it's needed. – kag0 Mar 28 '18 at 22:25

I18N is another reason NOT to use String for decimal numbers

In dozens of countries, such as Germany and France, the comma (,) is the decimal separator and the dot (.) is the thousands separator. See the list on Wikipedia.

If your JSON document carries decimal numbers as strings, you're relying on every API consumer applying the same number-format conversion (a separate step after JSON parsing). There's a risk of incorrect conversion due to inverted use of comma and dot as separators.

If you use number for decimal numbers that risk is averted.
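
A small Python sketch of the risk (the "7,47" and "7.470" payloads are hypothetical examples, not PayPal's format):

```python
import json

# A JSON number is locale-independent: the grammar defines "." as the
# decimal separator, so every parser agrees on the value.
assert json.loads('{"price": 7.47}')["price"] == 7.47

# A string leaves interpretation to each consumer. A producer in a
# comma-as-decimal locale might emit "7,47", which naive parsing rejects:
try:
    float("7,47")
except ValueError:
    print("'7,47' needs locale-aware handling")

# Worse, "7.470" meant as 7470 (dot as thousands separator) silently
# parses as seven point four seven elsewhere:
assert float("7.470") == 7.47
```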

Paulo Merson

Opinion

  1. Precision: usually a red herring
  2. Ambiguity: usually a red herring

See number at https://www.json.org/json-en.html and use serializers that obey the rules

Only if values might challenge the limits of IEEE 754 double precision (about 15 significant digits) should you specify string, because then you know that you are not working with "a number very much like a C or Java number"; make this explicit when publishing your API. I would always expect a double that is "almost exactly 1.7" to convert to decimal 1.7 exactly, without jumping through extra hoops (the only exception being values that may have more than ~15 significant digits).

  3. Formatting:
  • serve up values (that either side of the data exchange may use in calculations) as numbers
  • serve up information for display (one side formats for the other) as strings ("2023-07-01", "$ 1,234.57") and do not mix the two purposes
  • (a client receiving a number can - obviously - format as it chooses)
  4. The PayPal example:
    a decently written client shouldn't (have to) care (C# Convert.ToDouble(jsonValue) /* or ToDecimal */ handles "7.47" and 7.47 just the same).

Finally 1: While I prefer to "think JSON number" when I am producing the values, I wouldn't care to dictate what anyone else should do. I have recently had to interface with a mainframe COBOL system using IBM CICS Web Services and get numbers as strings with decimal commas instead of points ("1234,57"). Two options: tell the service owner to change the service to suit my religious view of how it should look, or adapt to use what I now know about how that system works... the choice is obvious.

Finally 2: Know and specify up-front what to do when non-numeric results arise (what should you do when a calculation produces an INF or NaN). If you really, really need to pass "+INF" as a valid result, then I'd say you have a case for using strings - again having to explicitly specify behaviour (lest your code breaks or you break normal deserialization at the other end - there is no agreed standard here and you can't assume that "INF" will deserialize to a double/decimal containing +INF).

AlanK