
To see how repr(x) works for float in CPython, I checked the source code for float_repr:

buf = PyOS_double_to_string(PyFloat_AS_DOUBLE(v),
                            'r', 0,
                            Py_DTSF_ADD_DOT_0,
                            NULL);

This calls PyOS_double_to_string with format code 'r' which seems to be translated to format code 'g' with precision set to 17:

precision = 17;
format_code = 'g';

So I'd expect repr(x) and f'{x:.17g}' to return the same representation. However, this doesn't seem to be the case:

>>> repr(1.1)
'1.1'
>>> f'{1.1:.17g}'
'1.1000000000000001'
>>> 
>>> repr(1.225)
'1.225'
>>> f'{1.225:.17g}'
'1.2250000000000001'

I understand that repr only needs to return as many digits as are necessary to reconstruct the exact same object as represented in memory, so '1.1' is obviously sufficient to get back 1.1. What I'd like to know is how (or why) this differs from the (internally used) .17g formatting option.
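To make the observation concrete, here is a quick check using only the standard library. It is a sketch, assuming a build with the short float repr; it shows that both strings denote the same double, and uses decimal.Decimal to print the exactly stored value:

```python
from decimal import Decimal

x = 1.1

short = repr(x)        # '1.1' on builds with the short float repr
long_ = f'{x:.17g}'    # always 17 significant digits

# Both strings parse back to the very same float.
same = float(short) == float(long_) == x

# Decimal(x) converts the binary double exactly, without any rounding.
exact = Decimal(x)

print(short, long_, same)
print(exact)
```

The 17-digit string is just the exact value rounded to 17 significant digits, so neither output is "wrong"; they differ only in how many digits are printed.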

(Python 3.7.3)

a_guest

1 Answer


It seems that you're looking at a fallback method:

/* The fallback code to use if _Py_dg_dtoa is not available. */

PyAPI_FUNC(char *) PyOS_double_to_string(double val,
                                         char format_code,
                                         int precision,
                                         int flags,
                                         int *type)
{
    char format[32];

The preprocessor macro that selects the fallback method is PY_NO_SHORT_FLOAT_REPR. If it is defined, the dtoa code is not compiled at all, because it would fail on such a platform:

/* if PY_NO_SHORT_FLOAT_REPR is defined, then don't even try to compile the following code */

That is probably not the case on most modern setups. This Q&A explains when and why Python selects either method: What causes Python's float_repr_style to use legacy?
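To check which of the two code paths a given interpreter uses, sys.float_repr_style can be inspected at runtime. A minimal sketch (the helper name uses_short_float_repr is made up for illustration):

```python
import sys


def uses_short_float_repr():
    """Return True if this interpreter uses the dtoa-based shortest repr.

    Hypothetical helper; it only wraps the documented sys.float_repr_style,
    which is either 'short' or 'legacy'.
    """
    return sys.float_repr_style == 'short'


print(sys.float_repr_style)
print(uses_short_float_repr())
```

If this prints 'legacy', the fallback shown above is the code that actually runs, and repr(1.1) would be expected to show 17 significant digits.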

Further down, at line 947, is the version that is used when _Py_dg_dtoa is available:

/* _Py_dg_dtoa is available. */


static char *
format_float_short(double d, char format_code,
                   int mode, int precision,
                   int always_add_sign, int add_dot_0_if_integer,
                   int use_alt_formatting, const char * const *float_strings,
                   int *type)

There you can see that 'g' and 'r' are handled slightly differently (as explained in the comments):

We used to convert at 1e17, but that gives odd-looking results for some values when a 16-digit 'shortest' repr is padded with bogus zeros.

case 'g':
    if (decpt <= -4 || decpt >
        (add_dot_0_if_integer ? precision-1 : precision))
        use_exp = 1;
    if (use_alt_formatting)
        vdigits_end = precision;
    break;
case 'r':
    /* convert to exponential format at 1e16.  We used to convert
       at 1e17, but that gives odd-looking results for some values
       when a 16-digit 'shortest' repr is padded with bogus zeros.
       For example, repr(2e16+8) would give 20000000000000010.0;
       the true value is 20000000000000008.0. */
    if (decpt <= -4 || decpt > 16)
        use_exp = 1;
    break;
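The different cut-offs for switching to exponent notation can be observed from Python. This is only a sketch that prints both forms side by side; it assumes sys.float_repr_style is 'short', and the sample values are chosen for illustration:

```python
# Values around the thresholds in the C code above:
# 'r' switches to exponent notation when decpt > 16 or decpt <= -4,
# 'g' with precision 17 switches when decpt > 17 or decpt <= -4.
samples = [1e15, 1e16, 1e17, 2e16 + 8, 0.0001, 0.00001]

for x in samples:
    # !r applies repr() first; .17g is the formatting the question compares with.
    print(f'{x!r:>24}   {x:.17g}')
```

For 1e16, for instance, repr already uses exponent notation while .17g still prints all the digits.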

The separate handling of 'r' and 'g' seems to match the behaviour you're describing. Note that "{:.16g}".format(1.225) yields 1.225.
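The effect of the precision can be explored by sweeping it and looking for the smallest value whose 'g' output still parses back to the same float. This is purely illustrative and not how CPython computes repr (the helper name shortest_roundtrip_g is made up, and the result need not equal repr for every input):

```python
def shortest_roundtrip_g(x):
    """Return (precision, text) for the smallest 'g' precision that round-trips.

    Illustrative helper only; it brute-forces precisions 1..17.
    """
    for precision in range(1, 18):
        text = f'{x:.{precision}g}'
        if float(text) == x:
            return precision, text
    # Reached for values that never compare equal to themselves, e.g. nan.
    raise ValueError('no round-tripping precision up to 17')


for x in (1.1, 1.225, 0.1 + 0.2):
    print(repr(x), shortest_roundtrip_g(x))
```

For 1.1 and 1.225 a handful of digits is enough, whereas 0.1 + 0.2 needs all 17, which is why a fixed .17g is always safe but often longer than necessary.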

Jean-François Fabre
  • That seems like a good hint. Do you know by chance, what this `_Py_dg_dtoa` is (it seems to be part of [dtoa.c](https://github.com/python/cpython/blob/v3.7.3/Python/dtoa.c)) and under what circumstances it is available or not? Also how could I check whether it is available for my particular Python install (it seems to be as per your answer but I'd like to run a crosscheck and understand more of the corresponding internal workings). Thanks. – a_guest Sep 08 '19 at 20:07
  • I found a link explaining more in detail what makes python select one or the other method. See my edit – Jean-François Fabre Sep 08 '19 at 20:14
  • @a_guest: For runtime checking, use `sys.float_repr_style`; that's exactly its intended use-case. I'd love to get to a place where we can just get rid of the fallback code altogether, but that would mean requiring IEEE 754 format doubles for CPython, which is controversial. – Mark Dickinson Sep 09 '19 at 15:25