87

I want to print some floating point numbers so that they're always written in decimal form (e.g. 12345000000000000000000.0 or 0.000000000000012345), not in scientific notation, yet I want the result to have up to the ~15.7 significant figures of an IEEE 754 double, and no more.

Ideally, the result should be the shortest string in positional decimal format that still yields the same value when converted back to a float.

It is well-known that the repr of a float is written in scientific notation if the exponent is greater than 15, or less than -4:

>>> n = 0.000000054321654321
>>> n
5.4321654321e-08  # scientific notation

If str is used, the resulting string again is in scientific notation:

>>> str(n)
'5.4321654321e-08'

It has been suggested that I can use format with f flag and sufficient precision to get rid of the scientific notation:

>>> format(0.00000005, '.20f')
'0.00000005000000000000'

It works for that number, though it has some extra trailing zeroes. But then the same format fails for .1, which gives decimal digits beyond the actual machine precision of float:

>>> format(0.1, '.20f')
'0.10000000000000000555'

And if my number is 4.5678e-20, using .20f would still lose relative precision:

>>> format(4.5678e-20, '.20f')
'0.00000000000000000005'

Thus these approaches do not match my requirements.
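
The two failure modes above can be checked mechanically: a candidate string either fails to parse back to the same float, or it round-trips but carries more digits than repr needs. A minimal sketch (the helper name round_trips is made up for illustration):

```python
def round_trips(text, value):
    """Return True if parsing `text` yields exactly the float `value`."""
    return float(text) == value

# '.20f' on a tiny number loses information: the string no longer round-trips.
lossy = format(4.5678e-20, '.20f')
print(round_trips(lossy, 4.5678e-20))   # False

# '.20f' on 0.1 does round-trip, but only by printing digits beyond the
# shortest representation that repr() already provides.
noisy = format(0.1, '.20f')
print(round_trips(noisy, 0.1), len(noisy) > len(repr(0.1)))   # True True
```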


This leads to the question: what is the easiest and also well-performing way to print an arbitrary floating point number in decimal format, having the same digits as in repr(n) (or str(n) on Python 3), but always using the decimal format, not scientific notation?

That is, a function or operation that for example converts the float value 0.00000005 to string '0.00000005'; 0.1 to '0.1'; 420000000000000000.0 to '420000000000000000.0' or 420000000000000000 and formats the float value -4.5678e-5 as '-0.000045678'.


After the bounty period: It seems that there are at least 2 viable approaches, as Karin demonstrated that using string manipulation one can achieve significant speed boost compared to my initial algorithm on Python 2.

Thus, since I am primarily developing on Python 3, I will accept my own answer, and shall award Karin the bounty.

  • And please if you do have a better answer to this question, do share it. – Antti Haapala -- Слава Україні Aug 09 '16 at 10:28
  • 3
    Project for a rainy day: add a low-level library function to Python (possibly in the `sys` module) that returns the "raw" binary-to-decimal conversion result for a given finite float (i.e., string of digits, decimal exponent, sign). That would give people the freedom to format as they saw fit. – Mark Dickinson Aug 11 '16 at 07:21
  • 3
    Short answer: no, there isn't an easier way to do this; at least, not one that I'm aware of, and that also gives decently precise results. (Any solution that involves first pre-processing the number by scaling by powers of 10 is going to risk introducing numerical errors.) – Mark Dickinson Aug 12 '16 at 07:48
  • since your required precision is 15.7 decimal digits ~= 16 decimal digits of precision, why do your examples request precision 20? – kederrac Sep 08 '19 at 19:40
  • The 20 isn't precision but scale! – Antti Haapala -- Слава Україні Sep 08 '19 at 19:44
  • you wrote `yet I'd want to keep the 15.7 decimal digits of precision and no more.` , also there is no such term like `scale` when you speak about format function even you may refer to the same thing: The precision is a decimal number indicating how many digits should be displayed after the decimal point for a floating point value – kederrac Sep 08 '19 at 19:55
  • @AnttiHaapala in all your examples the precision should be 16, not 20; also please remove 4.5678e-20 from your examples, since your requested precision is below 16 – kederrac Sep 09 '19 at 05:41
  • @rusu_ro1 There are 2 different formats at play here: the source, which is floating point, and the destination fixed point representation. To me it seems that OP is interested in preserving the **source** format's precision to the given spec. – Ilja Everilä Sep 09 '19 at 05:55
  • thx @IljaEverilä, this means that the OP post is incomplete? – kederrac Sep 09 '19 at 05:57
  • I don't see it that way, it just takes a bit of reading. – Ilja Everilä Sep 09 '19 at 06:06
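
Something close to the "raw" result described in Mark Dickinson's comment above (sign, digit string, decimal exponent) can be obtained today by parsing repr(f) with Decimal and reading its tuple form. This is not the low-level function the comment proposes, only an approximation built from existing standard-library calls; the helper name raw_decimal_parts is hypothetical.

```python
from decimal import Decimal

def raw_decimal_parts(f):
    """Return (sign, digits, exponent) for a finite float.

    The value equals (-1)**sign * int(digits) * 10**exponent.
    Digits come from repr(f), so they may include a trailing zero
    (repr(1.0) is '1.0', which yields digits '10' and exponent -1).
    """
    sign, digits, exponent = Decimal(repr(f)).as_tuple()
    return sign, ''.join(map(str, digits)), exponent

print(raw_decimal_parts(0.1))           # (0, '1', -1)
print(raw_decimal_parts(-4.5678e-05))   # (1, '45678', -9)
print(raw_decimal_parts(4.2e+17))       # (0, '42', 16)
```

With these three parts in hand, any positional layout can be assembled by hand without further floating-point arithmetic.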

7 Answers

70

Unfortunately it seems that not even the new-style formatting with float.__format__ supports this. The default formatting of floats is the same as with repr; and with f flag there are 6 fractional digits by default:

>>> format(0.0000000005, 'f')
'0.000000'

However there is a hack to get the desired result - not the fastest one, but relatively simple:

  • first the float is converted to a string using str() or repr()
  • then a new Decimal instance is created from that string.
  • Decimal.__format__ supports f flag which gives the desired result, and, unlike floats it prints the actual precision instead of default precision.

Thus we can make a simple utility function float_to_str:

import decimal

# create a new context for this task
ctx = decimal.Context()

# 20 digits should be enough for everyone :D
ctx.prec = 20

def float_to_str(f):
    """
    Convert the given float to a string,
    without resorting to scientific notation
    """
    d1 = ctx.create_decimal(repr(f))
    return format(d1, 'f')

Care must be taken not to use the global decimal context, so a new context is constructed for this function. This is the fastest way; another way would be to use decimal.localcontext, but it would be slower, creating a new thread-local context and a context manager for each conversion.
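
For comparison, the context-manager variant mentioned above might look like the following sketch. It uses decimal.localcontext() from the standard library; the function name float_to_str_local is made up here, and no timing claim is attached to it beyond what the paragraph above says.

```python
import decimal

def float_to_str_local(f):
    """Same conversion, but with a temporary context per call."""
    # localcontext() copies the current context and restores it on exit,
    # so the precision change below never leaks into other code.
    with decimal.localcontext() as ctx:
        ctx.prec = 20
        return format(ctx.create_decimal(repr(f)), 'f')

print(float_to_str_local(0.1))          # 0.1
print(float_to_str_local(0.00000005))   # 0.00000005
```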

This function now returns the string with all possible digits from mantissa, rounded to the shortest equivalent representation:

>>> float_to_str(0.1)
'0.1'
>>> float_to_str(0.00000005)
'0.00000005'
>>> float_to_str(420000000000000000.0)
'420000000000000000'
>>> float_to_str(0.000000000123123123123123123123)
'0.00000000012312312312312313'

The last result is rounded at the last digit.

As @Karin noted, float_to_str(420000000000000000.0) does not strictly match the format expected; it returns 420000000000000000 without trailing .0.
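
If the trailing .0 matters, one possible adjustment is to append it whenever the formatted string has no decimal point. This is a sketch under the assumption that only finite floats are passed in (inf and nan are not handled); the name float_to_str_with_point is made up.

```python
import decimal

# Separate context, same idea as above (prec=20 covers repr's 17 digits).
_ctx = decimal.Context(prec=20)

def float_to_str_with_point(f):
    """Positional string for a finite float, always containing a '.'."""
    s = format(_ctx.create_decimal(repr(f)), 'f')
    # Large values come back as a bare integer string; add '.0' to them.
    return s if '.' in s else s + '.0'

print(float_to_str_with_point(420000000000000000.0))   # 420000000000000000.0
print(float_to_str_with_point(0.1))                     # 0.1
```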

Community
37

If you are satisfied with the precision in scientific notation, then could we just take a simple string manipulation approach? Maybe it's not terribly clever, but it seems to work (passes all of the use cases you've presented), and I think it's fairly understandable:

def float_to_str(f):
    float_string = repr(f)
    if 'e' in float_string:  # detect scientific notation
        digits, exp = float_string.split('e')
        digits = digits.replace('.', '').replace('-', '')
        exp = int(exp)
        sign = '-' if f < 0 else ''
        if exp > 0:
            # the digits after the first one already fill len(digits) - 1
            # of the exp places before the decimal point
            zero_padding = '0' * (exp - (len(digits) - 1))
            float_string = '{}{}{}.0'.format(sign, digits, zero_padding)
        else:
            # minus 1 for the digit before the decimal point in the sci notation
            zero_padding = '0' * (-exp - 1)
            float_string = '{}0.{}{}'.format(sign, zero_padding, digits)
    return float_string

n = 0.000000054321654321
assert(float_to_str(n) == '0.000000054321654321')

n = 0.00000005
assert(float_to_str(n) == '0.00000005')

n = 420000000000000000.0
assert(float_to_str(n) == '420000000000000000.0')

n = 4.5678e-5
assert(float_to_str(n) == '0.000045678')

n = 1.1
assert(float_to_str(n) == '1.1')

n = -4.5678e-5
assert(float_to_str(n) == '-0.000045678')
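
A handful of assertions only covers a handful of exponents. A randomized round-trip check is one way to probe a converter more broadly. The sketch below is an illustration, not part of the original answer: check_round_trip and random_finite_float are made-up names, and the demo converter is the decimal-based one so that the block stays self-contained; any float-to-string function can be passed in instead.

```python
import decimal
import math
import random
import struct

def random_finite_float(rng):
    """Draw a float from random 64-bit patterns, skipping inf and nan."""
    while True:
        bits = rng.getrandbits(64)
        x = struct.unpack('<d', struct.pack('<Q', bits))[0]
        if math.isfinite(x):
            return x

def check_round_trip(convert, trials=1000, seed=0):
    """Assert that convert(x) is positional and parses back to x."""
    rng = random.Random(seed)
    for _ in range(trials):
        x = random_finite_float(rng)
        s = convert(x)
        assert 'e' not in s.lower(), (x, s)   # no scientific notation
        assert float(s) == x, (x, s)          # same value after parsing
    return trials

_ctx = decimal.Context(prec=20)

def decimal_float_to_str(f):
    return format(_ctx.create_decimal(repr(f)), 'f')

print(check_round_trip(decimal_float_to_str))
```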

Performance:

I was worried this approach may be too slow, so I ran timeit and compared with the OP's solution of decimal contexts. It appears the string manipulation is actually quite a bit faster. Edit: It appears to only be much faster in Python 2. In Python 3, the results were similar, but with the decimal approach slightly faster.

Result:

  • Python 2: using ctx.create_decimal(): 2.43655490875

  • Python 2: using string manipulation: 0.305557966232

  • Python 3: using ctx.create_decimal(): 0.19519368198234588

  • Python 3: using string manipulation: 0.2661344590014778

Here is the timing code:

from timeit import timeit

CODE_TO_TIME = '''
float_to_str(0.000000054321654321)
float_to_str(0.00000005)
float_to_str(420000000000000000.0)
float_to_str(4.5678e-5)
float_to_str(1.1)
float_to_str(-0.000045678)
'''
SETUP_1 = '''
import decimal

# create a new context for this task
ctx = decimal.Context()

# 20 digits should be enough for everyone :D
ctx.prec = 20

def float_to_str(f):
    """
    Convert the given float to a string,
    without resorting to scientific notation
    """
    d1 = ctx.create_decimal(repr(f))
    return format(d1, 'f')
'''
SETUP_2 = '''
def float_to_str(f):
    float_string = repr(f)
    if 'e' in float_string:  # detect scientific notation
        digits, exp = float_string.split('e')
        digits = digits.replace('.', '').replace('-', '')
        exp = int(exp)
        sign = '-' if f < 0 else ''
        if exp > 0:
            # the digits after the first one already fill len(digits) - 1
            # of the exp places before the decimal point
            zero_padding = '0' * (exp - (len(digits) - 1))
            float_string = '{}{}{}.0'.format(sign, digits, zero_padding)
        else:
            # minus 1 for the digit before the decimal point in the sci notation
            zero_padding = '0' * (-exp - 1)
            float_string = '{}0.{}{}'.format(sign, zero_padding, digits)
    return float_string
'''

print(timeit(CODE_TO_TIME, setup=SETUP_1, number=10000))
print(timeit(CODE_TO_TIME, setup=SETUP_2, number=10000))
Karin
  • You could actually specify the initialization (`def format_float`; `import decimal; ctx = ...`) as the second argument to `timeit`; that way it doesn't get included to the measurements. – Antti Haapala -- Слава Україні Aug 17 '16 at 04:32
  • 2
    Ahh that seems obvious from the docs now. Great to know! I've updated my timing code and it looks much cleaner now thanks to you :) – Karin Aug 17 '16 at 04:39
  • I need to add one more case to test, though. The number can be negative, your's still calculates `n = -4.5678e-5`; `assert(format_float(n) == '-0.000045678')` incorrectly :D – Antti Haapala -- Слава Україні Aug 17 '16 at 13:12
  • And another gotcha more: This is way faster on Python 2 than my code, but slower on Python 3; seems that in Python 3 the decimal constructor performs much better than in Python 2. – Antti Haapala -- Слава Україні Aug 17 '16 at 14:51
  • 2
    I'm consistently surprised how often the naive "just stringify it" approach works, and sometimes works even better than other cases. – Wayne Werner Aug 17 '16 at 14:52
  • @Antti Fascinating! I can confirm your approach is much faster in Python 3 than Python 2. Another weirdness though, is that the `420000000000000000.0` use case actually fails for me for your decimal approach in Python 2 and 3. Very strange =\ – Karin Aug 17 '16 at 15:35
  • @Karin it is because it seems that if `decimal` has more than 16 places, there is no `.0` any longer. – Antti Haapala -- Слава Україні Aug 17 '16 at 16:48
  • @Antti But then how did it work for you in your answer's example usage? – Karin Aug 17 '16 at 17:25
  • 1
    Frankly, I didn't remember that the returned string was without `.0`, I didn't copy-paste my example output from Python shell, instead writing it here. Good catch :D I fixed my answer. – Antti Haapala -- Слава Україні Aug 17 '16 at 17:26
  • 1
    `decimal` has received [several speed improvements in Python 3.3](https://docs.python.org/3/whatsnew/3.3.html#decimal) (switch to libmpdec, caching, etc.) leading to 10x - 100x performance gains depending on what you are trying to make it do. – Martijn Pieters Aug 17 '16 at 17:56
  • Karin, not only were you the only answerer to understand what I sought for, but you also found a clever approach to achieve it using string manipulation that performs very well on Python 2. :D Thus I awarded the bounty to you. However, I chose to accept my self-answer in this case since the project for which we needed this uses Python 3, and we're already successfully using my approach. – Antti Haapala -- Слава Україні Aug 18 '16 at 18:24
  • (Ah one more thing, this should be using `repr` instead of `str` to [get consistent results Python 2 vs 3.](http://stackoverflow.com/questions/38847690/convert-float-to-string-without-scientific-notation-and-false-precision/38847691?noredirect=1#comment65401025_38847691)) – Antti Haapala -- Слава Україні Aug 18 '16 at 18:31
  • 1
    @Antti Thanks! This was a fun use case :) Also updated my code to use `repr` as suggested. – Karin Aug 19 '16 at 01:45
  • Good answer, but to be honest, I feel like this should be implemented in python directly and doable through `.format`. I don't see why `.format` doesn't include this use case. Printing a number in non-scientific notation with significant figures for example requires a hack like this. Yet I imagine it's an extremely common use case for plotting scientific figures with short logarithmic scales. – Marses Oct 28 '19 at 12:17
  • still not working for `float_to_str(333333333333333333333333333333333333333333333333333333333333333333333333333333.333333333333333333333333333333333333333333333333333333333333333)` – recolic Jul 22 '20 at 15:00
  • this only works when exp in float_to_str() is <0. The 3rd test case happens to work because there is only one decimal digit in the scientific notation. It won't work if there are more than 1. (n = 421000000000000000.0 will not work) – fivelements Jun 02 '21 at 12:46
  • why there is wrong result : float_to_str(27052805291130213231.64) Out[440]: '27052805291130212000000000000000000.0' – CS QGB Aug 24 '23 at 08:37
30

As of NumPy 1.14.0, you can just use numpy.format_float_positional. For example, running against the inputs from your question:

>>> numpy.format_float_positional(0.000000054321654321)
'0.000000054321654321'
>>> numpy.format_float_positional(0.00000005)
'0.00000005'
>>> numpy.format_float_positional(0.1)
'0.1'
>>> numpy.format_float_positional(4.5678e-20)
'0.000000000000000000045678'

numpy.format_float_positional uses the Dragon4 algorithm to produce the shortest decimal representation in positional format that round-trips back to the original float input. There's also numpy.format_float_scientific for scientific notation, and both functions offer optional arguments to customize things like rounding and trimming of zeros.

user2357112
  • 2
    Hey, that's nice. Not practical if NumPy is not needed otherwise, but if it is this is definitely what one should be using. – Antti Haapala -- Слава Україні Mar 23 '19 at 08:19
  • 1
    Even better answer. Though my opinion is that this functionality should be included directly as an option in the `.format` method for strings. Decimal representations with a significant figure limit are an extremely common use case in scientific graphs with logarithmic scales. – Marses Oct 28 '19 at 12:21
  • numpy.format_float_positional(27052805291130213231.64)=='27052805291130212000.' – CS QGB Aug 24 '23 at 08:36
  • @CSQGB: That's normal. Floats don't have enough precision to represent all the digits of 27052805291130213231.64. The value gets rounded to a float with exact numeric value 27052805291130212352, and `'27052805291130212000.'` is the shortest (in terms of minimum significant digits) decimal representation in positional format that produces that same float. – user2357112 Aug 24 '23 at 09:20
  • Note that the question specifically wanted to avoid reporting false precision, and asked for "the result to have the up to ~15.7 significant figures of a IEEE 754 double, and no more". Returning `'27052805291130212352.'` would go against what the question asked for (and the float doesn't contain enough information to return `'27052805291130213231.64'`). – user2357112 Aug 24 '23 at 09:26
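
The numbers in the last two comments can be reproduced with the standard library alone. The snippet below is only an illustration of that explanation; it does not call NumPy.

```python
from decimal import Decimal

x = 27052805291130213231.64    # more digits than a double can hold

# The double actually stored is an integer at this magnitude.
print(int(x))                   # 27052805291130212352
print(Decimal(x))               # exact conversion, same value

# The shorter string with trailing zeros parses back to the same double,
# which is why a shortest-round-trip formatter may print it.
print(float('27052805291130212000') == x)   # True
```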
6

If you are willing to accept the arbitrary loss of precision that comes from calling str() on the float, then this is the way to go:

import decimal

def float_to_string(number, precision=20):
    return '{0:.{prec}f}'.format(
        decimal.Context(prec=100).create_decimal(str(number)),
        prec=precision,
    ).rstrip('0').rstrip('.') or '0'

It doesn't include global variables and allows you to choose the precision yourself. Decimal precision 100 is chosen as an upper bound for str(float) length. The actual supremum is much lower. The or '0' part is for the situation with small numbers and zero precision.

Note that it still has its consequences:

>>> float_to_string(0.10101010101010101010101010101)
'0.10101010101'

Otherwise, if the precision is important, format is just fine:

import decimal

def float_to_string(number, precision=20):
    return '{0:.{prec}f}'.format(
        number, prec=precision,
    ).rstrip('0').rstrip('.') or '0'

This version does not suffer from the precision lost by calling str(f). As before, the or '0' part covers small numbers formatted with a low precision:

>>> float_to_string(0.1, precision=10)
'0.1'
>>> float_to_string(0.1)
'0.10000000000000000555'
>>> float_to_string(0.1, precision=40)
'0.1000000000000000055511151231257827021182'

>>> float_to_string(4.5678e-5)
'0.000045678'

>>> float_to_string(4.5678e-5, precision=1)
'0'

Anyway, maximum decimal places are limited, since the float type itself has its limits and cannot express really long floats:

>>> float_to_string(0.1, precision=10000)
'0.1000000000000000055511151231257827021181583404541015625'
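
The reason the digits stop is that every finite double is a binary fraction and therefore has a finite, exact decimal expansion. As an illustration (not part of the function above), Decimal can show that expansion directly:

```python
from decimal import Decimal

exact = Decimal(0.1)            # exact conversion of the stored double
text = format(exact, 'f')
fraction_digits = len(text.split('.')[1])

print(text)              # 0.1000000000000000055511151231257827021181583404541015625
print(fraction_digits)   # 55
```

Any precision above 55 only adds zeros for this value, which rstrip('0') then removes.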

Also, whole numbers are being formatted as-is.

>>> float_to_string(100)
'100'
gukoff
3

I think rstrip can get the job done.

a=5.4321654321e-08
'{0:.40f}'.format(a).rstrip("0") # float number and delete the zeros on the right
# '0.0000000543216543210000004442039220863003' # there's roundoff error though

Let me know if that works for you.

silgon
2

Interesting question. To add a little more content to it, here's a little test comparing the outputs of @Antti Haapala's and @Harold's solutions:

import decimal
import math

ctx = decimal.Context()


def f1(number, prec=20):
    ctx.prec = prec
    return format(ctx.create_decimal(str(number)), 'f')


def f2(number, prec=20):
    return '{0:.{prec}f}'.format(
        number, prec=prec,
    ).rstrip('0').rstrip('.')

k = 2*8

for i in range(-2**8,2**8):
    if i<0:
        value = -k*math.sqrt(math.sqrt(-i))
    else:
        value = k*math.sqrt(math.sqrt(i))

    value_s = '{0:.{prec}E}'.format(value, prec=10)

    n = 10

    # print() with a single argument behaves the same on Python 2 and 3
    print(' | '.join([str(value), value_s]))
    for f in [f1, f2]:
        test = [f(value, prec=p) for p in range(n)]
        print('\t{0}'.format(test))

Neither of them gives "consistent" results for all cases.

  • With Antti's you'll see strings like '-000' or '000'
  • With Harold's you'll see strings like ''

I'd prefer consistency even if I'm sacrificing a little bit of speed. Depends which tradeoffs you want to assume for your use-case.

BPL
0

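Using format(float, '.Nf'), with N derived from the repr:

old = 0.00000000000000000000123
s = repr(old)
if 'e-' in s:
    mantissa, exp = s.split('e-')
    # decimal places needed: the exponent plus the mantissa's fractional digits
    frac_digits = len(mantissa.partition('.')[2])
    new = format(old, '.{}f'.format(int(exp) + frac_digits))
    print(old)
    print(new)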