Difference of precision/display between numpy.tolist() and list()

Question

This is kind of a follow up to coldspeed's question.

(And, by the way, this is not a duplicate of "Is floating point math broken?")

I'm converting a list of lists to a numpy array, and then trying to convert it back to a python list of lists.

import numpy as np

x = [[  1.00000000e+00,   6.61560000e-13],
       [  2.00000000e+00,   3.05350000e-13],
       [  3.00000000e+00,   6.22240000e-13],
       [  4.00000000e+00,   3.08850000e-13],
       [  5.00000000e+00,   1.11170000e-10],
       [  6.00000000e+00,   3.82440000e-11],
       [  7.00000000e+00,   5.39160000e-11],
       [  8.00000000e+00,   1.75910000e-11],
       [  9.00000000e+00,   2.27330000e-10]]

x=np.array(x,np.float)  # np.float is just an alias for the builtin float (the alias was removed in NumPy 1.24)
print([y.tolist() for y in x])
print([list(y) for y in x])

Result:

[[1.0, 6.6156e-13], [2.0, 3.0535e-13], [3.0, 6.2224e-13], [4.0, 3.0885e-13], [5.0, 1.1117e-10], [6.0, 3.8244e-11], [7.0, 5.3916e-11], [8.0, 1.7591e-11], [9.0, 2.2733e-10]]
[[1.0, 6.6155999999999996e-13], [2.0, 3.0535000000000001e-13], [3.0, 6.2223999999999998e-13], [4.0, 3.0884999999999999e-13], [5.0, 1.1117e-10], [6.0, 3.8243999999999997e-11], [7.0, 5.3915999999999998e-11], [8.0, 1.7591e-11], [9.0, 2.2733e-10]]

Note that trying to match python native types also fails (same behavior):

x=np.array(x,dtype=float)

So converting the lists back to normal python lists using numpy.tolist preserves values, whereas forcing iteration by calling list on them introduces rounding errors.

Fun fact:

str([y.tolist() for y in x])==str([list(y) for y in x]) yields False (as expected, different printouts)
[y.tolist() for y in x]==[list(y) for y in x] yields True (what the hell??)

Any thoughts? (using python 3.4 64 bits windows)

Equality testing at least recognises that the values are the same, even if their representation is muddy. What Python version is this on? — Martijn Pieters, Aug 15 '17 at 08:08
@JRichardSnape agreed but both `y.tolist()` and `list(y)` have the same type so they should be represented the exact same way — Jean-François Fabre, Aug 15 '17 at 08:11
Fair point, you can check the lists themselves, they are exactly the same. Musing on why the representation would be different. BTW - repros on 2.7 as well (I checked as I usually do with these) — J Richard Snape, Aug 15 '17 at 08:12
@Jean-FrançoisFabre: I was able to repro on OS X with 3.6.2 as well. — Martijn Pieters, Aug 15 '17 at 08:14
`list` iterates on the first dimension of the array (try `list(x)`). `tolist` iterates through all dimensions, returning a (nested) list of Python objects. Since `for y in x:` is just as good as `for y in list(x):`, we rarely need to use `list`. `tolist` is much more useful. — hpaulj, Aug 15 '17 at 16:43

jotasi · Accepted Answer · 2017-08-15T14:16:52.920

The two methods produce elements of different types, and those types have different string representations even when they hold the same value. Calling tolist() on the array converts its elements to Python's built-in float, while calling list() on it leaves the data type unchanged, so the elements stay numpy.float64:

import numpy as np

x = [[  1.00000000e+00,   6.61560000e-13],
       [  2.00000000e+00,   3.05350000e-13],
       [  3.00000000e+00,   6.22240000e-13],
       [  4.00000000e+00,   3.08850000e-13],
       [  5.00000000e+00,   1.11170000e-10],
       [  6.00000000e+00,   3.82440000e-11],
       [  7.00000000e+00,   5.39160000e-11],
       [  8.00000000e+00,   1.75910000e-11],
       [  9.00000000e+00,   2.27330000e-10]]

x = np.array(x, dtype=float)  # np.float was only an alias for the builtin float and was removed in NumPy 1.24

print(type(x[0].tolist()[0]))     # `float`
print(type(list(x[0])[0]))        # `numpy.float64`

Because those two types have different string representations (float prints the short, rounded form, while numpy.float64 prints the full precision here), the two printouts differ and the comparison str([y.tolist() for y in x])==str([list(y) for y in x]) fails, while the value-wise comparison passes.
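A minimal sketch of the point above, assuming a single illustrative row (the names row, as_python, as_numpy and normalized are made up for this example). It shows that the values compare equal even though the element types differ, and that calling .item() on each numpy.float64 gives back a builtin float, so the printouts match again:

```python
import numpy as np

# One illustrative row taken from the question's data.
row = np.array([1.0, 6.6156e-13], dtype=float)

as_python = row.tolist()   # elements are converted to builtin float
as_numpy = list(row)       # elements stay numpy.float64

print(type(as_python[0]).__name__)   # float
print(type(as_numpy[0]).__name__)    # float64

# The stored values are identical, so element-wise comparison succeeds.
values_equal = as_python == as_numpy
print(values_equal)                  # True

# .item() converts each NumPy scalar to the matching builtin type,
# after which both lists have the same element type and the same text.
normalized = [v.item() for v in as_numpy]
types_match = all(type(v) is float for v in normalized)
strings_match = str(normalized) == str(as_python)
print(types_match, strings_match)    # True True
```

Whether str(as_numpy) itself differs from str(as_python) depends on the NumPy version, so the sketch only checks types and values rather than the exact printed digits.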

edited Aug 15 '17 at 14:16

answered Aug 15 '17 at 08:13

jotasi


good. Any way to store the data in `numpy` array exactly like a python array? – Jean-François Fabre Aug 15 '17 at 08:14
agreed but doing that changes nothing. – Jean-François Fabre Aug 15 '17 at 08:16
Oh, yes I just realized that. – jotasi Aug 15 '17 at 08:16
My *guess* is that `tolist()` does this deliberately in case you are passing the produced list to code that cannot be guaranteed to be aware of numpy types. Not sure whether that's a good idea. Nice work @jotasi. – J Richard Snape Aug 15 '17 at 08:16
Yup - here we are - "Data items are converted to the nearest compatible Python type." https://github.com/numpy/numpy/blob/75545583a89647b810862076ae385a6c396e3eb0/numpy/ma/core.py#L5761 – J Richard Snape Aug 15 '17 at 08:20
Thanks @JRichardSnape. That seems to be the logical explanation. – jotasi Aug 15 '17 at 08:23
@Jean-FrançoisFabre Btw, it works if you use `object` as `dtype` of the array. Then you always get the rounded string conversion. – jotasi Aug 15 '17 at 14:16
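For illustration, a small sketch of the object-dtype suggestion from the last comment, assuming a shortened two-row version of the data (data, obj_arr, via_list and via_tolist are names invented here). With dtype=object the array holds the original Python float objects, so list() and tolist() hand back the same type:

```python
import numpy as np

# Shortened, illustrative version of the question's data.
data = [[1.0, 6.6156e-13], [2.0, 3.0535e-13]]

# An object array keeps references to the Python floats themselves.
obj_arr = np.array(data, dtype=object)

via_list = [list(row) for row in obj_arr]
via_tolist = [row.tolist() for row in obj_arr]

# Every element is a builtin float, whichever way the rows are converted.
element_types = {type(v) for row in via_list for v in row}
print(element_types)                 # {<class 'float'>}

# Both conversions now print identically, and match the input.
same_text = str(via_list) == str(via_tolist)
print(same_text)                     # True
print(str(via_list) == str(data))    # True
```

The trade-off is that object arrays give up NumPy's fast native float64 arithmetic, so this is probably only worth it when the exact Python types matter more than speed.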
