I am playing with the pandas to_json method and cannot quite understand its behaviour.
import json
import pandas as pd
FILENAME = 'test.json'
df = pd.DataFrame({6.0: [1.0e3, 1.7e42, 2.0e-4, 1.7e-7],
                   50.0: [1.034, 1.3e-42, 1.2e17, 0.1]},
                  index=[75.0, 19.0, 84.0, 12.0])
df.to_json(FILENAME, double_precision=2)
with open(FILENAME, 'r') as jsonfile:
    jsondata = json.load(jsonfile)
print(json.dumps(jsondata, indent=4))
This prints some numbers in fixed point, and some numbers in exponential notation.
{
    "6.0": {
        "75.0": 1000.0,
        "19.0": 1.7e+42,
        "84.0": 0.0,
        "12.0": 0.0
    },
    "50.0": {
        "75.0": 1.03,
        "19.0": 1.3e-42,
        "84.0": 1.2e+17,
        "12.0": 0.1
    }
}
In particular, I cannot get the value at 6.0 - 12.0 (column 6.0, index 12.0, originally 1.7e-7) to be printed in exponential notation. It is always saved as 0.0, which really screws up the numerics.
Is there a way to enforce exponential notation for DataFrame.to_json?
Why is it treating 1.7e-7 differently from 1.3e-42?
The boundary seems to lie between exponents of e-15 and e-16. That is, 1.7e-15 would be exported as 0.0, while 1.7e-16 would be exported as 1.7e-16. This probably has something to do with the np.float64 representation.
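To make the distinction concrete, here is a minimal plain-Python sketch of the two kinds of rounding I have in mind: keeping a fixed number of decimal places versus keeping a fixed number of significant digits. This is only an illustration of the arithmetic, not the code path that to_json actually uses; the helper names round_decimal_places and round_significant are made up for this example.

```python
# Plain-Python illustration only; this is NOT what pandas does internally.
# Both helper names are hypothetical and exist just for this comparison.

def round_decimal_places(x: float, places: int) -> float:
    """Keep `places` digits after the decimal point (fixed-point rounding)."""
    return round(x, places)


def round_significant(x: float, digits: int) -> float:
    """Keep `digits` significant digits by formatting in exponential notation."""
    return float(f"{x:.{digits - 1}e}")


# Compare both roundings on some of the values from the DataFrame above.
for value in (1.0e3, 1.7e42, 2.0e-4, 1.7e-7):
    print(value, round_decimal_places(value, 2), round_significant(value, 2))
```

With two decimal places, anything smaller than 0.005 in magnitude collapses to 0.0, which matches what I see for 2.0e-4 and 1.7e-7. With two significant digits the small values survive, which is the behaviour I would like to get.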
Here is a really striking example; it shows that to_json does not preserve monotonicity:
xf = pd.DataFrame({'a': [1.0e-15, 1.0e-16]})
xf.to_json('data.json')
with open('data.json', 'r') as jsonfile:
    jsondata = json.load(jsonfile)
print(json.dumps(jsondata, indent=4))
would print
{
    "a": {
        "0": 0.0,
        "1": 1e-16
    }
}
The value in row 1 is now greater than the value in row 0.
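For comparison, a possible workaround is to bypass to_json entirely and serialize with the standard json module, which writes floats using their repr and so switches to exponential notation on its own. The sketch below assumes the data is first turned into a nested dict (for example via xf.to_dict()); to keep it self-contained I use a hand-written dict of that shape instead of a real DataFrame. The helper round_floats is hypothetical and only there to limit the number of significant digits.

```python
import json


def round_floats(obj, digits=3):
    """Recursively round every float in nested dicts/lists to `digits` significant digits."""
    if isinstance(obj, float):
        # Formatting in exponential notation keeps significant digits,
        # so small values are not flushed to 0.0.
        return float(f"{obj:.{digits - 1}e}")
    if isinstance(obj, dict):
        return {key: round_floats(value, digits) for key, value in obj.items()}
    if isinstance(obj, list):
        return [round_floats(value, digits) for value in obj]
    return obj


# Hand-written stand-in for the nested dict that xf.to_dict() would give (assumption).
data = {"a": {0: 1.0e-15, 1: 1.0e-16}}

text = json.dumps(round_floats(data, digits=2), indent=4)
restored = json.loads(text)
print(text)
```

After the round trip, the value in row 0 is still greater than the value in row 1, so the ordering is preserved. Note that json.dumps turns the integer keys into the strings "0" and "1", just like in the to_json output above.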