I'm serialising some data with an Avro schema. The code is written in Python, and I'm seeing a loss of precision: Python appears to round the numbers and print them in scientific notation.
What I see: 1.2345678901234568e+16
What I expect to see: 12345678901234567.19
Reproducible code sample:
from fastavro import writer, reader, parse_schema

schema = {
    'doc': 'A weather reading.',
    'name': 'Weather',
    'namespace': 'test',
    'type': 'record',
    'fields': [
        {'name': 'station', 'type': 'string'},
        {'name': 'time', 'type': 'double'},
        {'name': 'temp', 'type': 'double'},
    ],
}
parsed_schema = parse_schema(schema)

# 'records' can be an iterable (including generator)
records = [
    {u'station': u'011990-99999', u'temp': 0, u'time': 1433269388},
    {u'station': u'011990-99999', u'temp': -11, u'time': 12345678901234567.19},
    {u'station': u'012650-99999', u'temp': 111, u'time': 1433275478},
]

# Writing
with open('weather.avro', 'wb') as out:
    writer(out, parsed_schema, records)

# Reading
with open('weather.avro', 'rb') as fo:
    for record in reader(fo):
        print(record)
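The rounding described above can be reproduced without Avro at all. This is a minimal sketch using only the standard library. The variable names are illustrative. It suggests that the literal is already rounded when Python stores it as a 64-bit float, before any serialisation happens.

```python
from decimal import Decimal

# The literal is converted to a 64-bit float as soon as Python evaluates it.
as_float = 12345678901234567.19
print(repr(as_float))

# Above 2**53 a double cannot hold every integer, let alone the fraction,
# so the value is stored as the nearest representable number.
print(as_float == 12345678901234568.0)

# A decimal.Decimal built from a string keeps every digit.
as_decimal = Decimal('12345678901234567.19')
print(as_decimal)
```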
I believe there might be a way to override or write my own deserialiser, which would give me control over how a double is deserialised into a string.
Any ideas?