Format floats with standard json module

Question

I am using the standard json module in python 2.6 to serialize a list of floats. However, I'm getting results like this:

>>> import json
>>> json.dumps([23.67, 23.97, 23.87])
'[23.670000000000002, 23.969999999999999, 23.870000000000001]'

I want the floats to be formatted with only two decimal digits. The output should look like this:

>>> json.dumps([23.67, 23.97, 23.87])
'[23.67, 23.97, 23.87]'

I have tried defining my own JSON Encoder class:

class MyEncoder(json.JSONEncoder):
    def encode(self, obj):
        if isinstance(obj, float):
            return format(obj, '.2f')
        return json.JSONEncoder.encode(self, obj)

This works for a sole float object:

>>> json.dumps(23.67, cls=MyEncoder)
'23.67'

But fails for nested objects:

>>> json.dumps([23.67, 23.97, 23.87], cls=MyEncoder)
'[23.670000000000002, 23.969999999999999, 23.870000000000001]'

I don't want to have external dependencies, so I prefer to stick with the standard json module.

How can I achieve this?

score 91 · Accepted Answer · edited Jun 14 '23 at 11:53

NOTE: This does not work in any recent version of Python.

Unfortunately, I believe you have to do this by monkey-patching (which, in my opinion, indicates a design defect in the standard library json package). E.g., this code:

import json
from json import encoder
encoder.FLOAT_REPR = lambda o: format(o, '.2f')
    
print(json.dumps(23.67))
print(json.dumps([23.67, 23.97, 23.87]))

emits:

23.67
[23.67, 23.97, 23.87]

as you desire. Obviously, there should be an architected way to override FLOAT_REPR so that EVERY representation of a float is under your control if you wish it to be; but unfortunately that's not how the json package was designed.

edited Jun 14 '23 at 11:53

Mateen Ulhaq

answered Sep 19 '09 at 02:48

Alex Martelli

12

This solution does not work in Python 2.7 using Python's C version of the JSON encoder. – Nelson Apr 06 '11 at 23:14
28

However you do this, use something like %.15g or %.12g instead of %.3f . – Guido van Rossum Mar 12 '13 at 21:04
27

I found this snippet in a junior programmer's code. This would have created a very serious but subtle bug if it had not been caught. Can you please place a warning on this code explaining the global implications of this monkey patching. – Rory Hart Apr 25 '13 at 03:13
13

It's good hygiene to set it back when you're done: `original_float_repr = encoder.FLOAT_REPR` `encoder.FLOAT_REPR = lambda o: format(o, '.2f')` `print json.dumps(1.0001)` `encoder.FLOAT_REPR = original_float_repr` – Jeff Kaufman Oct 18 '13 at 17:05
1

Incidentally, `format(.3, '.2f')` seems to be an order of magnitude slower than `"%.2f" % .3`. – Aryeh Leib Taurog Mar 06 '14 at 22:12
1

Does not work for me on standard Python 2.7.3 on win64. The first list is formatted normally, but after that the behavior flips to the default and all consecutive lists are formatted with the default behavior. – Jan Šimbera Jan 01 '15 at 19:08
3

Please note that the `original_float_repr` solution is not threadsafe. – konrad Apr 25 '16 at 12:27
10

As others have pointed out, this is no longer working in at least Python 3.6+. Add a few digits to `23.67` to see how `.2f` is not respected. – Nico Schlömer Jun 05 '18 at 13:18
Yes, this is a quite silly example... why use 2 digit rounding if you're only feeding it two significant-digit numbers? – naught101 Mar 15 '19 at 06:18
Unfortunately, this solution no longer works on the latest version Python 2 and 3 – xuancong84 Jun 30 '20 at 10:00
2

This is not working in python 3.8. The formatting is ignored. 0.3 and 0.34566 are printed this way, instead of with 2 decimal places – franksands Jun 15 '23 at 13:55

score 63 · Answer 2 · edited Jul 23 '20 at 22:10

import simplejson
    
class PrettyFloat(float):
    def __repr__(self):
        return '%.15g' % self
    
def pretty_floats(obj):
    if isinstance(obj, float):
        return PrettyFloat(obj)
    elif isinstance(obj, dict):
        return dict((k, pretty_floats(v)) for k, v in obj.items())
    elif isinstance(obj, (list, tuple)):
        return list(map(pretty_floats, obj))
    return obj
    
print(simplejson.dumps(pretty_floats([23.67, 23.97, 23.87])))

emits

[23.67, 23.97, 23.87]

No monkeypatching necessary.
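
Whether this trick works depends on the encoder calling repr() on the value. A minimal check, assuming CPython 3 and the standard json module rather than simplejson (PrettyFloat is redefined here with two decimals purely for illustration): the overridden __repr__ is honoured by repr(), but json.dumps appears to format floats with float.__repr__ directly and so ignores it.

```python
import json


class PrettyFloat(float):
    # Illustrative variant of the class above, fixed at two decimals.
    def __repr__(self):
        return '%.2f' % self


value = PrettyFloat(23.671234)

via_repr = repr(value)          # goes through PrettyFloat.__repr__
via_json = json.dumps([value])  # stdlib encoder bypasses the override

print(via_repr)
print(via_json)
```

If the two lines differ on your interpreter, the subclass approach is not taking effect there and one of the pre-rounding answers is the safer route.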

edited Jul 23 '20 at 22:10

Nico Schlömer

answered Nov 14 '09 at 03:16

Tom Wuttke

2

I like this solution; better integration, and works with 2.7. Because I am building up the data myself anyway, I eliminated the `pretty_floats` function and simply integrated it into my other code. – mikepurvis Feb 22 '12 at 21:25
1

In Python3 it gives **"Map object is not JSON serializable"** error, but you can resolve converting the map() to a list with `list( map(pretty_floats, obj) )` – Guglie Oct 11 '18 at 23:54
1

@Guglie: that's because in Python 3 `map` returns iterator, not a `list` – Azat Ibrakov Oct 30 '18 at 13:27
9

Doesn't work for me (Python 3.5.2, simplejson 3.16.0). Tried it with %.6g and [23.671234556, 23.971234556, 23.871234556], it still prints the whole number. – szali May 16 '19 at 09:53
Can you explain why simplejson is required here? Is this answer just so old that you weren't using Python 2.6? – supermitch May 31 '23 at 16:04

score 35 · Answer 3 · answered Mar 15 '15 at 21:30

Really unfortunate that dumps doesn't allow you to do anything to floats. However loads does. So if you don't mind the extra CPU load, you could throw it through the encoder/decoder/encoder and get the right result:

>>> json.dumps(json.loads(json.dumps([.333333333333, .432432]), parse_float=lambda x: round(float(x), 3)))
'[0.333, 0.432]'
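
The same round trip can be wrapped in a small helper. This is only a sketch: the name dumps_rounded and its ndigits parameter are made up for illustration, and the extra encode/decode pass still costs CPU time.

```python
import json


def dumps_rounded(obj, ndigits=3, **kwargs):
    """Serialize obj, rounding every float to ndigits decimals (hypothetical helper)."""
    text = json.dumps(obj, **kwargs)
    # parse_float receives the textual form of each JSON float.
    rounded = json.loads(text, parse_float=lambda s: round(float(s), ndigits))
    return json.dumps(rounded, **kwargs)


print(dumps_rounded([.333333333333, .432432]))
print(dumps_rounded({'a': 1.23456, 'b': 'x 1.23456'}, ndigits=2))
```

Because parse_float is only called for JSON numbers, digits inside strings are left alone.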

answered Mar 15 '15 at 21:30

Claude

3

The simplest suggestion here that also works in 3.6. – Brent Faust Mar 27 '18 at 02:13
Note the phrase "don't mind the extra CPU load". Definitely do not use this solution if you have a lot of data to serialize. For me, adding this alone made a program doing a non-trivial calculation take 3X longer. – shaneb Jun 29 '18 at 16:16
1

This does not work if you need a precision of 6 decimals or so. – Daniel F Feb 11 '21 at 22:44

score 29 · Answer 4 · answered Apr 06 '11 at 23:29

If you're using Python 2.7, a simple solution is to simply round your floats explicitly to the desired precision.

>>> import json, sys
>>> sys.version
'2.7.1 (r271:86832, Nov 27 2010, 18:30:46) [MSC v.1500 32 bit (Intel)]'
>>> json.dumps(1.0/3.0)
'0.3333333333333333'
>>> json.dumps(round(1.0/3.0, 2))
'0.33'

This works because Python 2.7 made float rounding more consistent. Unfortunately this does not work in Python 2.6:

>>> sys.version
'2.6.6 (r266:84292, Dec 27 2010, 00:02:40) \n[GCC 4.4.5]'
>>> json.dumps(round(1.0/3.0, 2))
'0.33000000000000002'

The solutions mentioned above are workarounds for 2.6, but none are entirely adequate. Monkey patching json.encoder.FLOAT_REPR does not work if your Python runtime uses a C version of the JSON module. The PrettyFloat class in Tom Wuttke's answer works, but only if %g encoding works globally for your application. The %.15g is a bit magic: it works because a float needs up to 17 significant digits to be reproduced exactly, so cutting the output at 15 hides the representation noise, and %g does not print trailing zeroes.
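
To make the %.15g point concrete, here is a small comparison using two of the values from the question. It only uses plain % formatting; the variable names are illustrative.

```python
values = [23.67, 23.97, 1.0 / 3.0]

full = ['%.17g' % v for v in values]   # enough digits to reproduce the float exactly
short = ['%.15g' % v for v in values]  # representation noise rounded away

for f, s in zip(full, short):
    print(f, s)

# The price of %.15g: the printed text no longer round-trips for every value.
third_survives = float('%.15g' % (1.0 / 3.0)) == 1.0 / 3.0
print(third_survives)
```

So %.15g gives clean output for "short" numbers such as 23.67, at the cost of losing the last digit or two of values that genuinely need them.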

I spent some time trying to make a PrettyFloat that allowed customization of precision for each number, i.e., a syntax like

>>> json.dumps(PrettyFloat(1.0 / 3.0, 4))
'0.3333'

It's not easy to get this right. Inheriting from float is awkward. Inheriting from object and using a JSONEncoder subclass with its own default() method should work, except the json module seems to assume all custom types should be serialized as strings. I.e., you end up with the JavaScript string "0.33" in the output, not the number 0.33. There may be a way yet to make this work, but it's harder than it looks.
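
A minimal sketch of the default() behaviour described above. FixedPrecision, StringEncoder and NumberEncoder are hypothetical names invented for this example. Returning a formatted str from default() yields a JSON string, while returning a rounded float yields a JSON number (which relies on the short float repr of Python 2.7 and later, and cannot pad trailing zeroes).

```python
import json


class FixedPrecision(object):
    """Hypothetical wrapper pairing a float with a number of decimals."""

    def __init__(self, value, digits):
        self.value = value
        self.digits = digits


class StringEncoder(json.JSONEncoder):
    def default(self, o):
        if isinstance(o, FixedPrecision):
            # A str comes out quoted, i.e. as a JSON string.
            return format(o.value, '.%df' % o.digits)
        return json.JSONEncoder.default(self, o)


class NumberEncoder(json.JSONEncoder):
    def default(self, o):
        if isinstance(o, FixedPrecision):
            # A float comes out unquoted, i.e. as a JSON number.
            return round(o.value, o.digits)
        return json.JSONEncoder.default(self, o)


as_string = json.dumps(FixedPrecision(1.0 / 3.0, 4), cls=StringEncoder)
as_number = json.dumps(FixedPrecision(1.0 / 3.0, 4), cls=NumberEncoder)
print(as_string)
print(as_number)
```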

Another approach for Python 2.6 using JSONEncoder.iterencode and pattern matching can be seen at https://github.com/migurski/LilJSON/blob/master/liljson.py — Nelson, Nov 21 '12 at 18:35
Hopefully this makes passing around your floats more lightweight - I like how we can avoid messing with the JSON classes which can suck. — Lincoln B, Dec 25 '12 at 02:11

score 21 · Answer 5 · answered Dec 16 '18 at 01:15

Here's a solution that worked for me in Python 3 and does not require monkey patching:

import json

def round_floats(o):
    if isinstance(o, float): return round(o, 2)
    if isinstance(o, dict): return {k: round_floats(v) for k, v in o.items()}
    if isinstance(o, (list, tuple)): return [round_floats(x) for x in o]
    return o


json.dumps(round_floats([23.63437, 23.93437, 23.842347]))

Output is:

[23.63, 23.93, 23.84]

It copies the data but with rounded floats.
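
One thing to keep in mind, shown in the sketch below with the same round_floats function: rounding caps the number of decimals but does not pad them, so this gives "at most two decimals" rather than a fixed-width format such as 1.00.

```python
import json


def round_floats(o):
    # Same function as above, repeated so the example is self-contained.
    if isinstance(o, float):
        return round(o, 2)
    if isinstance(o, dict):
        return {k: round_floats(v) for k, v in o.items()}
    if isinstance(o, (list, tuple)):
        return [round_floats(x) for x in o]
    return o


result = json.dumps(round_floats({'a': 1.0, 'b': [2.5001, 3.14159]}))
print(result)
```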

score 9 · Answer 6 · answered Oct 29 '09 at 19:15

If you're stuck with Python 2.5 or earlier versions: The monkey-patch trick does not seem to work with the original simplejson module if the C speedups are installed:

$ python
Python 2.5.4 (r254:67916, Jan 20 2009, 11:06:13) 
[GCC 4.2.1 (SUSE Linux)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import simplejson
>>> simplejson.__version__
'2.0.9'
>>> simplejson._speedups
<module 'simplejson._speedups' from '/home/carlos/.python-eggs/simplejson-2.0.9-py2.5-linux-i686.egg-tmp/simplejson/_speedups.so'>
>>> simplejson.encoder.FLOAT_REPR = lambda f: ("%.2f" % f)
>>> simplejson.dumps([23.67, 23.97, 23.87])
'[23.670000000000002, 23.969999999999999, 23.870000000000001]'
>>> simplejson.encoder.c_make_encoder = None
>>> simplejson.dumps([23.67, 23.97, 23.87])
'[23.67, 23.97, 23.87]'
>>>

score 8 · Answer 7 · answered Sep 19 '09 at 02:40

You can do what you need to do, but it isn't documented:

>>> import json
>>> json.encoder.FLOAT_REPR = lambda f: ("%.2f" % f)
>>> json.dumps([23.67, 23.97, 23.87])
'[23.67, 23.97, 23.87]'

answered Sep 19 '09 at 02:40

Ned Batchelder

7

Looks neat, but seems to not work on Python 3.6. In particular, I didn't see a `FLOAT_REPR` constant in the `json.encoder` module. – Tomasz Gandor Jan 23 '19 at 11:54
For a simple monkey-patch solution that works with Python 3, see my answer at https://stackoverflow.com/questions/54370322/ – proski Sep 04 '21 at 14:55

score 8 · Answer 8 · answered Aug 08 '20 at 18:30

Using numpy

If you actually have really long floats you can round them up/down correctly with numpy:

import json 

import numpy as np

data = np.array([23.671234, 23.97432, 23.870123])

json.dumps(np.around(data, decimals=2).tolist())

'[23.67, 23.97, 23.87]'

Nico Schlömer · Answer 9 · 2020-08-01T14:05:06.567

5

I just released fjson, a small Python library to fix this issue. Install with

pip install fjson

and use just like json, with the addition of the float_format parameter:

import math
import fjson


data = {"a": 1, "b": math.pi}
print(fjson.dumps(data, float_format=".6e", indent=2))

{
  "a": 1,
  "b": 3.141593e+00
}

edited Aug 01 '20 at 14:05

answered Jul 25 '20 at 22:12

Nico Schlömer

score 3 · Answer 10 · answered Jul 20 '11 at 23:35

Alex Martelli's solution will work for single-threaded apps, but may not work for multi-threaded apps that need to control the number of decimal places per thread. Here is a solution that should work in multi-threaded apps:

import threading
from json import encoder

def FLOAT_REPR(f):
    """
    Serialize a float to a string, with a given number of digits
    """
    decimal_places = getattr(encoder.thread_local, 'decimal_places', 0)
    format_str = '%%.%df' % decimal_places
    return format_str % f

encoder.thread_local = threading.local()
encoder.FLOAT_REPR = FLOAT_REPR     

#As an example, call like this:
import json

encoder.thread_local.decimal_places = 1
json.dumps([1.56, 1.54]) #Should result in '[1.6, 1.5]'

You can simply set encoder.thread_local.decimal_places to the number of decimal places you want, and the next call to json.dumps() in that thread will use that number of decimal places.

Mike Fogel · Answer 11 · 2013-06-07T01:16:06.777

If you need to do this in python 2.7 without overriding the global json.encoder.FLOAT_REPR, here's one way.

import json
import math

class MyEncoder(json.JSONEncoder):
    "JSON encoder that renders floats to two decimal places"

    FLOAT_FRMT = '{0:.2f}'

    def floatstr(self, obj):
        return self.FLOAT_FRMT.format(obj)

    def _iterencode(self, obj, markers=None):
        # stl JSON lame override #1
        new_obj = obj
        if isinstance(obj, float):
            if not math.isnan(obj) and not math.isinf(obj):
                new_obj = self.floatstr(obj)
        return super(MyEncoder, self)._iterencode(new_obj, markers=markers)

    def _iterencode_dict(self, dct, markers=None):
        # stl JSON lame override #2
        new_dct = {}
        for key, value in dct.iteritems():
            if isinstance(key, float):
                if not math.isnan(key) and not math.isinf(key):
                    key = self.floatstr(key)
            new_dct[key] = value
        return super(MyEncoder, self)._iterencode_dict(new_dct, markers=markers)

Then, in python 2.7:

>>> from tmp import MyEncoder
>>> enc = MyEncoder()
>>> enc.encode([23.67, 23.98, 23.87])
'[23.67, 23.98, 23.87]'

In python 2.6, it doesn't quite work as Matthew Schinckel points out below:

>>> from tmp import MyEncoder
>>> enc = MyEncoder()  
>>> enc.encode([23.67, 23.97, 23.87])
'["23.67", "23.97", "23.87"]'

Those look like strings, not numbers. – Matthew Schinckel Jan 14 '12 at 12:57

Matt · Answer 12 · 2022-06-27T14:39:26.693

New Answer:

Inspired by this answer. It looks scary but actually works perfectly:

import json

class RoundingFloat(float):
    __repr__ = staticmethod(lambda x: format(x, '.2f'))

json.encoder.c_make_encoder = None

json.encoder.float = RoundingFloat

print(json.dumps({'number': 1.0 / 81}))
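
Since the patch above is global, one option is to apply it only temporarily. The following is a sketch, not a vetted recipe: fixed_float_repr is a made-up helper, it relies on the CPython implementation detail that the pure-Python encoder looks up the name float in json.encoder's module globals, and it is not thread-safe.

```python
import contextlib
import json


@contextlib.contextmanager
def fixed_float_repr(digits):
    """Temporarily render floats with a fixed number of decimals (hypothetical helper)."""

    class RoundingFloat(float):
        __repr__ = staticmethod(lambda x: format(x, '.%df' % digits))

    saved_c_encoder = json.encoder.c_make_encoder
    json.encoder.c_make_encoder = None   # force the pure-Python encoder
    json.encoder.float = RoundingFloat   # shadow the builtin inside the module
    try:
        yield
    finally:
        del json.encoder.float           # fall back to the builtin float again
        json.encoder.c_make_encoder = saved_c_encoder


with fixed_float_repr(3):
    inside = json.dumps({'number': 1.0 / 81})
outside = json.dumps({'number': 0.125})

print(inside)
print(outside)
```

Outside the with block json.dumps behaves as usual, so different call sites can pick different precisions, as long as they do not run concurrently.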

Old Answer Below:

I am amazed / bemused that this is not a feature. Fortunately, the TensorFlow authors have already solved this problem by using a regex:

import json
import re

def FormatFloat(json_str, float_digits):
  pattern = re.compile(r'\d+\.\d+')
  float_repr = '{:.' + '{}'.format(float_digits) + 'f}'

  def MRound(match):
    return float_repr.format(float(match.group()))

  return re.sub(pattern, MRound, json_str)

def Dumps(obj, float_digits=-1, **params):
  """Wrapper of json.dumps that allows specifying the float precision used.

  Args:
    obj: The object to dump.
    float_digits: The number of digits of precision when writing floats out.
    **params: Additional parameters to pass to json.dumps.

  Returns:
    output: JSON string representation of obj.
  """
  json_str = json.dumps(obj, **params)

  if float_digits > -1:
    json_str = FormatFloat(json_str, float_digits)

  return json_str

This works by just wrapping json.dumps from the standard package then running a regex on the result.
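
One caveat with post-processing the serialized text: the pattern cannot tell a JSON number from digits inside a string or a key. A short demonstration, using a condensed stand-in for FormatFloat (format_floats is an illustrative name, not part of any library):

```python
import json
import re


def format_floats(json_str, digits):
    # Condensed equivalent of FormatFloat above: reformat anything that looks like a float.
    return re.sub(r'\d+\.\d+',
                  lambda m: '{:.{}f}'.format(float(m.group()), digits),
                  json_str)


raw = json.dumps({'version': '1.2.3456', 'x': 1.23456})
result = format_floats(raw, 2)

print(raw)
print(result)
```

The value of "x" is rounded as intended, but the version string is rewritten as well, so this approach is only safe when no string or key in the data can contain something float-like.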

I was tempted to do something like this, but this solution has a MAJOR drawback: it will affect everything that _looks like a float_, even if it is in the middle of a string or, even worse, in a key of the JSON representation of the data. It could be used only for very specific kinds of data when you are 100% sure you won't find anything looking like a float anywhere else. — Victor Schröder, Dec 30 '21 at 00:55
I would tend to agree, I like this solution in the end https://stackoverflow.com/a/69056325/5125264 — Matt, Jan 04 '22 at 16:06
In fact, the solution mentioned by @Matt above is what I ended up using, but that solution has another major drawback, because it applies the format globally. In case we need different precision for different parts of the JSON serialized string, that won't work... — Victor Schröder, Jan 04 '22 at 19:23

score 1 · Answer 13 · answered Apr 30 '14 at 13:51

When importing the standard json module, it is enough to change the default encoder FLOAT_REPR. There isn't really the need to import or create Encoder instances.

import json
json.encoder.FLOAT_REPR = lambda o: format(o, '.2f')

json.dumps([23.67, 23.97, 23.87]) #returns  '[23.67, 23.97, 23.87]'

Sometimes it is also very useful to output as JSON the best representation Python can guess with str. This will make sure significant digits are not ignored.

import json
json.dumps([23.67, 23.9779, 23.87489])
# output is '[23.670000000000002, 23.977900000000002, 23.874890000000001]'

json.encoder.FLOAT_REPR = str
json.dumps([23.67, 23.9779, 23.87489])
# output is '[23.67, 23.9779, 23.87489]'

score 1 · Answer 14 · answered Apr 08 '17 at 14:54

I agree with @Nelson that inheriting from float is awkward, but perhaps a solution that only touches the __repr__ function might be forgivable. I ended up using the decimal package for this to reformat floats when needed. The upside is that this works in all contexts where repr() is being called, so also when simply printing lists to stdout, for example. Also, the precision is runtime configurable, after the data has been created. The downside is of course that your data needs to be converted to this special float class (as unfortunately you cannot seem to monkey patch float.__repr__). For that I provide a brief conversion function.

The code:

import decimal
C = decimal.getcontext()

class decimal_formatted_float(float):
   def __repr__(self):
       s = str(C.create_decimal_from_float(self))
       if '.' in s: s = s.rstrip('0')
       return s

def convert_to_dff(elem):
    try:
        return elem.__class__(map(convert_to_dff, elem))
    except:
        if isinstance(elem, float):
            return decimal_formatted_float(elem)
        else:
            return elem

Usage example:

>>> import json
>>> li = [(1.2345,),(7.890123,4.567,890,890.)]
>>>
>>> decimal.getcontext().prec = 15
>>> dff_li = convert_to_dff(li)
>>> dff_li
[(1.2345,), (7.890123, 4.567, 890, 890)]
>>> json.dumps(dff_li)
'[[1.2345], [7.890123, 4.567, 890, 890]]'
>>>
>>> decimal.getcontext().prec = 3
>>> dff_li = convert_to_dff(li)
>>> dff_li
[(1.23,), (7.89, 4.57, 890, 890)]
>>> json.dumps(dff_li)
'[[1.23], [7.89, 4.57, 890, 890]]'

This doesn't work with the built-in Python3 json package, which doesn't use __repr__(). — Ian Goldby, Jul 23 '19 at 07:51

score 1 · Answer 15 · answered Aug 30 '22 at 02:23

To achieve fixed-precision float output in the .json file, one way is to make changes to the encoder.py module of the json package (python_dir\lib\json).

I first created a class:

        import reprlib

        class FloatRepr(reprlib.Repr):
            def repr_float(self, value, level):
                return format(value, '.2f')

Then, modify the floatstr function to the following:

        def floatstr(o, allow_nan=self.allow_nan, _repr=float.__repr__, _inf=INFINITY, _neginf=-INFINITY):
            if o != o:
                text = 'NaN'
            elif o == _inf:
                text = 'Infinity'
            elif o == _neginf:
                text = '-Infinity'
            else:
                # return _repr(o)  # original line, commented out
                return FloatRepr().repr(o)  # changed: format via FloatRepr

            if not allow_nan:
                raise ValueError(
                    "Out of range float values are not JSON compliant: " +
                    repr(o))

            return text

By doing this, the float values in the .json file are written as numbers, not strings.

score 0 · Answer 16 · answered Jan 13 '21 at 13:01

I did that :) Beware that with my code you will always get two digits after the decimal point:

>>> json_dumps_with_two_digit_float({'a': 1.0})
'{"a": 1.00}'

My custom function:

from unittest.mock import patch
import json
# We need to ensure that c encoder will not be launched
@patch('json.encoder.c_make_encoder', None)
def json_dumps_with_two_digit_float(some_object):
    # saving original method
    of = json.encoder._make_iterencode
    def inner(*args, **kwargs):
        args = list(args)
        # the fifth argument is the float formatter, which we replace
        args[4] = lambda o: '{:.2f}'.format(o)
        return of(*args, **kwargs)
    
    with patch('json.encoder._make_iterencode', wraps=inner):
        return json.dumps(some_object)

Don't forget to create some tests in your project, because my function relies heavily on the implementation of Python's json module, which may change in the future.

mirekphd · Answer 17 · 2023-03-31T10:59:43.237

Nearly a decade and a half has passed and tools have improved immensely, so there are several off-the-shelf custom rounding functions now. One of the most versatile and memory efficient comes from the popular numpy package: numpy.format_float_positional:

problem_data_dict = \
{0: 0,
 1: 0.1,
 2: 0.01,
 3: 0.001,
 4: 0.0001,
 5: 0.00001,
 6: 0.000001,
 7: 0.0000001,
 8: 0.00000001,
 9: 5.6e-05,
 10: 9.8e-05,
 11: 8e-05,
 12: 3e-05,
 15: 5.9e-05,
 16: 0.9e-06,
 17: 7.1e-04,
 18: 3.6e-05,
 19: 6.234e-03,
 20: 5.42e-06,
 21: 8.12e-05}

import json
import numpy as np

max_precision = 6

json.dumps({k:np.format_float_positional(v, 
                                         unique=True,
                                         # trim="-",
                                         # trim=".",
                                         trim="0",
                                         precision=max_precision) 
            for k,v in problem_data_dict.items()})


'{"0": "0.0", "1": "0.1", "2": "0.01", "3": "0.001", "4": "0.0001", "5": "0.00001", "6": "0.000001", "7": "0.0", "8": "0.0", "9": "0.000056", "10": "0.000098", "11": "0.00008", "12": "0.00003", "15": "0.000059", "16": "0.000001", "17": "0.00071", "18": "0.000036", "19": "0.006234", "20": "0.000005", "21": "0.000081"}'

Sam Watkins · Answer 18 · 2012-03-21T06:04:26.043

Pros:

Works with any JSON encoder, or even python's repr.
Short(ish), seems to work.

Cons:

Ugly regexp hack, barely tested.

Quadratic complexity.

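import re

def fix_floats(json, decimals=2, quote='"'):
    # group 1: everything before the float (skipping over quoted strings);
    # group 2: a float with more than `decimals` digits after the point
    pattern = r'^((?:(?:"(?:\\.|[^\\"])*?")|[^"])*?)(-?\d+\.\d{' + str(decimals) + r'}\d+)'
    pattern = re.sub('"', quote, pattern)
    fmt = "%%.%df" % decimals
    n = 1
    # repeat until no over-long float is left (one substitution per pass)
    while n:
        json, n = re.subn(pattern, lambda m: m.group(1) + (fmt % float(m.group(2))).rstrip('0'), json)
    return json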
