There are a few existing questions about float formatting, but none answer the following question, I think.

I'm looking for a way to print large floats in a long, nicely rounded and localized format:

>>> print magic_format(1.234e22, locale="en_US")
12,340,000,000,000,000,000,000
>>> print magic_format(1.234e22, locale="fr_FR")
12 340 000 000 000 000 000 000

Unfortunately, magic_format does not exist. ;-) How can I implement it?

Details

Here are a few ways to print floats. None of them produces the above output:

>>> x = 1.234e22
>>> print str(x)
1.234e+22
>>> print repr(x)
1.234e+22
>>> print "%f" % x
12339999999999998951424.000000
>>> print "%g" % x
1.234e+22

Fail: either I get the short version, or a non-grouping non-localized non-rounded output.

BTW, I understand that 1.234e22 cannot be stored exactly as a float: there's a necessary rounding error, which explains the odd output above. But since str, repr and "%g" % x are able to round that to the intended value, I would like to have the same friendly rounded number, but in a long and localized form.

Let's try localizing now...

>>> import locale
>>> locale.setlocale(locale.LC_ALL, "en_US")
'en_US'
>>> locale.format("%g", x, grouping = True)
'1.234e+22'
>>> locale.format("%f", x, grouping = True)
'12,339,999,999,999,998,951,424.000000'
>>> locale.setlocale(locale.LC_ALL, "fr_FR")
'fr_FR'
>>> locale.format("%g", x, grouping = True)
'1,234e+22'
>>> locale.format("%f", x, grouping = True)
'12339999999999998951424,000000'

Closer, but not ok. I still have the annoying rounding error, and the French localization sucks: it does not group the digits at all.

So let's use the excellent Babel library, perhaps it can do everything I want:

>>> from babel.numbers import format_number
>>> format_number(x, locale = "en_US")
u'12,339,999,999,999,998,951,424'
>>> format_number(x, locale = "fr_FR")
u'12\xa0339\xa0999\xa0999\xa0999\xa0998\xa0951\xa0424'

Wow, really close. They even use non-breakable spaces for grouping in French, I love it. It's really too bad they still have the rounding issue.

Hey!? What if I used python Decimals?

>>> from decimal import Decimal
>>> Decimal(x)
Decimal('12339999999999998951424')
>>> Decimal("%g" % x)
Decimal('1.234E+22')
>>> "%g" % Decimal("%g" % x)
'1.234e+22'
>>> "%f" % Decimal("%g" % x)
'12339999999999998951424.000000'

Nope. I can get an exact representation of the number I want with Decimal("%g" % x), but whenever I try to display it, it's either short or converted to a bad float before it's printed.
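A minimal sketch of one way around the float conversion, assuming Python 3: the `%` operator turns the Decimal back into a float, whereas the built-in `format()` hands the format spec to the Decimal itself. The variable names are only illustrative.

```python
from decimal import Decimal

x = 1.234e22

# repr(x) is the short, friendly string; Decimal stores it exactly.
d = Decimal(repr(x))

# format() dispatches to Decimal.__format__, so no float is involved
# and the "f" presentation type writes the value out in full.
long_form = format(d, "f")
print(long_form)
```

This only gives the long form. Grouping and localization still have to be added on top of it.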

But what if I mixed Babel and Decimals?

>>> Decimal("%g" % 1.234e22)
Decimal('1.234E+22')
>>> dx = _
>>> format_number(dx, locale = "en_US")
Traceback (most recent call last):
...
TypeError: bad operand type for abs(): 'str'

Ouch. But Babel's got a function called format_decimal, let's use that instead:

>>> from babel.numbers import format_decimal
>>> format_decimal(dx, locale = "en_US")
Traceback (most recent call last):
...
TypeError: bad operand type for abs(): 'str'

Oops, format_decimal can't format python Decimals. :-(

Ok, one last idea: I could try converting to a long.

>>> x = 1.234e22
>>> long(x)
12339999999999998951424L
>>> long(Decimal(x))
12339999999999998951424L
>>> long(Decimal("%g" % x))
12340000000000000000000L

Yes! I've got the exact number I want to format. Let's give that to Babel:

>>> format_number(long(Decimal("%g" % x)), locale = "en_US")
u'12,339,999,999,999,998,951,424'

Oh, no... Apparently Babel converts the long to a float before trying to format it. I'm out of luck, and out of ideas. :-(
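Since the exact integer is already available, a hedged alternative (assuming Python 3) is to let the integer format itself with the `,` option and then swap in the group symbol of the target locale. The non-breaking space used for French below is copied from the Babel output shown earlier. This is a sketch, not a general localization routine.

```python
from decimal import Decimal

x = 1.234e22

# Exact integer value of the friendly representation.
# int() truncates, so this only suits values with no fractional part.
n = int(Decimal(repr(x)))

# Python integers have arbitrary precision, so nothing is rounded here.
en = "{:,}".format(n)

# Replace the English group symbol with a non-breaking space (U+00A0).
fr = en.replace(",", "\u00a0")

print(en)
print(fr)
```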

If you think that this is tough, then try answering the same question for x = 1.234e-22. So far all I can print is either the short form 1.234e-22 or 0.0!

I would prefer this:

>>> print magic_format(1.234e-22, locale="en_US")
0.0000000000000000000001234
>>> print magic_format(1.234e-22, locale="fr_FR")
0,0000000000000000000001234
>>> print magic_format(1.234e-22, locale="en_US", group_frac=True)
0.000,000,000,000,000,000,000,123,400
>>> print magic_format(1.234e-22, locale="fr_FR", group_frac=True)
0,000 000 000 000 000 000 000 123 400

I can imagine writing a little function that would parse "1.234e-22" and format it nicely, but I would have to know all about the rules of number localization, and I'd rather not reinvent the wheel: Babel is supposed to do that. What should I do?
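The little function imagined above might be sketched as follows, assuming Python 3. The names `magic_format_sketch` and `group_digits` are hypothetical, and the group and decimal symbols are passed in by the caller (for instance from Babel's `get_group_symbol` and `get_decimal_symbol`), so the sketch itself knows nothing about locales. It also assumes groups of three digits, which does not hold for every locale.

```python
from decimal import Decimal

def group_digits(digits, sep, from_right):
    """Insert sep between groups of three digits."""
    if from_right:
        # Integer part: groups are counted leftwards from the decimal point.
        rev = digits[::-1]
        chunks = [rev[i:i + 3][::-1] for i in range(0, len(rev), 3)]
        return sep.join(reversed(chunks))
    # Fractional part: groups are counted rightwards from the decimal point.
    return sep.join(digits[i:i + 3] for i in range(0, len(digits), 3))

def magic_format_sketch(x, group_symbol=",", decimal_symbol=".", group_frac=False):
    """Long, friendly-rounded rendering of a float (hypothetical helper)."""
    d = Decimal(repr(x))           # shortest string that round-trips
    if not d.is_finite():
        return str(d)              # inf and nan are left alone
    sign = "-" if d.is_signed() else ""
    plain = format(d.copy_abs(), "f")          # no exponent, no float rounding
    int_part, _, frac_part = plain.partition(".")
    out = sign + group_digits(int_part, group_symbol, from_right=True)
    if frac_part:
        if group_frac:
            # Pad with zeros so that the last group is complete.
            frac_part += "0" * (-len(frac_part) % 3)
            frac_part = group_digits(frac_part, group_symbol, from_right=False)
        out += decimal_symbol + frac_part
    return out

print(magic_format_sketch(1.234e22))
print(magic_format_sketch(1.234e-22, group_frac=True))
```

Passing `"\u00a0"` and `","` as the two symbols gives the French-style output.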

Thanks for your help. :-)

desertnaut
MiniQuark
  • Here's something to get you started: `'%0200f' % 1.234e22`. Add commas and spaces to taste. – U2EF1 Jun 14 '13 at 18:28
  • Do you want the output to be a number as well or a string representation of the number? – Chris Hagmann Jun 14 '13 at 18:48
  • Try this: http://stackoverflow.com/questions/2663612/nicely-representing-a-floating-point-number-in-python – gilgil28 Jun 14 '13 at 19:04
  • @U2EF1: I think you meant '%.200f' % 1.234e22 (your code returns "0000[...]00012339999[...]8951424.000000"). This still has the rounding issue. – MiniQuark Jun 14 '13 at 20:20
  • @cdhagmann: I want a string representation of the number. – MiniQuark Jun 14 '13 at 20:21
  • @gilgil28: thanks for the link. I had seen other float-format related questions, but not this one; it's interesting. It does not handle l10n though, and that's the part I'm most reluctant to code myself, as it's usually really tough to get right for all languages (sure, I can hard code the rules for English and French, but I don't want to code the rules for 100 languages). Plus, they don't handle "friendly" rounding: f(3.14, 20) == "3.1400000000000001243". So out of my three wishes, they just have one: long, non-exponential format. – MiniQuark Jun 14 '13 at 20:33
  • @MiniQuark did you check the answers below? – Saullo G. P. Castro May 24 '14 at 10:21

1 Answer

This takes a large chunk of code from the selected answer to "Nicely representing a floating-point number in python", but incorporates Babel to handle L10N.

NOTE: Babel uses a non-breaking space (u'\xa0') rather than a plain space as the group symbol for a lot of locales. Hence the if statement that mentions 'fr_FR' directly, to convert it to a normal space character.
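For reference, a small sketch (assuming Python 3) that identifies the character in question and shows what the replacement does. The sample string is made up for illustration. Note that swapping in a plain space allows a line break in the middle of the number.

```python
import unicodedata

nbsp = "\u00a0"

# The group symbol seen in the French Babel output is U+00A0.
name = unicodedata.name(nbsp)
print(name)

# Illustrative string using that symbol as the group separator.
french = "12\u00a0340\u00a0000"

# Replacing it with an ordinary space looks the same on screen,
# but the number may then be wrapped across two lines.
plain = french.replace(nbsp, " ")
print(plain)
```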

from babel.numbers import get_decimal_symbol,get_group_symbol
import decimal

# https://stackoverflow.com/questions/2663612/nicely-representing-a-floating-point-number-in-python/2663623#2663623
def float_to_decimal(f):
    # http://docs.python.org/library/decimal.html#decimal-faq
    "Convert a floating point number to a Decimal with no loss of information"
    n, d = f.as_integer_ratio()
    numerator, denominator = decimal.Decimal(n), decimal.Decimal(d)
    ctx = decimal.Context(prec=60)
    result = ctx.divide(numerator, denominator)
    while ctx.flags[decimal.Inexact]:
        ctx.flags[decimal.Inexact] = False
        ctx.prec *= 2
        result = ctx.divide(numerator, denominator)
    return result 

def f(number, sigfig):
    assert(sigfig>0)
    try:
        d=decimal.Decimal(number)
    except TypeError:
        d=float_to_decimal(float(number))
    sign,digits,exponent=d.as_tuple()
    if len(digits) < sigfig:
        digits = list(digits)
        digits.extend([0] * (sigfig - len(digits)))    
    shift=d.adjusted()
    result=int(''.join(map(str,digits[:sigfig])))
    # Round the result
    if len(digits)>sigfig and digits[sigfig]>=5: result+=1
    result=list(str(result))
    # Rounding can change the length of result
    # If so, adjust shift
    shift+=len(result)-sigfig
    # reset len of result to sigfig
    result=result[:sigfig]
    if shift >= sigfig-1:
        # Tack more zeros on the end
        result+=['0']*(shift-sigfig+1)
    elif 0<=shift:
        # Place the decimal point in between digits
        result.insert(shift+1,'.')
    else:
        # Tack zeros on the front
        assert(shift<0)
        result=['0.']+['0']*(-shift-1)+result
    if sign:
        result.insert(0,'-')
    return ''.join(result)

def magic_format(num, locale="en_US", group_frac=True):
    sep = get_group_symbol(locale)
    if sep == get_group_symbol('fr_FR'): 
        sep = ' '
    else:
        sep = str(sep)
    dec = str(get_decimal_symbol(locale))

    n = float(('%E' % num)[:-4:])
    sigfig = len(str(n)) - (1 if '.' in str(n) else 0) 

    s = f(num,sigfig)

    if group_frac:
        ans = ""
        if '.' not in s:
            point = None
            new_d = ""
            new_s = s[::-1]
        else:
            point = s.index('.')
            new_d = s[point+1::]
            new_s = s[:point:][::-1]
        for idx,char in enumerate(new_d):
            ans += char
            if (idx+1)%3 == 0 and (idx+1) != len(new_d): 
                ans += sep
        else: ans = ans[::-1] + (dec if point != None else '')
        for idx,char in enumerate(new_s):
            ans += char
            if (idx+1)%3 == 0 and (idx+1) != len(new_s): 
                ans += sep 
        else:
            ans = ans[::-1]
    else:
        ans = s
    return ans

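This chunk of code can be used as follows (num and num2 are not defined here; judging from the output of f, they appear to be 1.23456e22 and 1.23456e-22):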

>>> magic_format(num2, locale = 'fr_FR')
'0,000 000 000 000 000 000 000 123 456 0'
>>> magic_format(num2, locale = 'de_DE')
'0,000.000.000.000.000.000.000.123.456.0'
>>> magic_format(num2)
'0.000,000,000,000,000,000,000,123,456'
>>> f(num,6)
'12345600000000000000000'
>>> f(num2,6)
'0.000000000000000000000123456'

with the f function coming from the link.
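The number of significant digits handed to f is derived in magic_format by slicing the "%E" output. A possible alternative, sketched here for Python 3 under the hypothetical name `count_sigfigs`, is to read the digits of the Decimal built from repr. This is not part of the answer's code, only an illustration of the same idea.

```python
from decimal import Decimal

def count_sigfigs(x):
    """Count the significant digits in the shortest repr of a float."""
    # normalize() strips trailing zeros, so 100.0 counts as one digit.
    d = Decimal(repr(x)).normalize()
    return len(d.as_tuple().digits)

print(count_sigfigs(1.234e22))
print(count_sigfigs(1.23456e-22))
```

Unlike the slicing approach, this does not depend on how many characters the exponent occupies.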

Chris Hagmann
  • I just noticed that I made group_frac default to True and not to False. It works either way. – Chris Hagmann Jun 14 '13 at 20:28
  • Thanks! The unicode character is a non-breakable space (equivalent of HTML's `&nbsp;`). It's actually useful, you don't want to remove it: it will prevent the number being split at the end of a line. – MiniQuark Jun 14 '13 at 22:10