Why do values in a dataframe change when converting from int to float and back?

Question

I am getting very strange behavior for a series of SKUs in my code when using "astype(float)" and "astype(int)", and I am at a loss to explain why. This seems to happen only on my local machine (I couldn't reproduce it in an online Jupyter Notebook).

Here is a list of products where this problem occurs and the DataFrame I am creating from them:

import pandas as pd

products = {'SKU': [1111000120, 1111000160, 1111000182, 1111000210, 1111001300, 2412601027,
                    2412601449, 5172100236, 5172100370, 5172100713, 7130104717]}

dfprod = pd.DataFrame.from_dict(products)

When I convert this df to float and then back to int on my local machine, I get the following: (linked image: Conversion error)

I found this question, which deals with a similar problem, but it is about C++, so I'm not sure how applicable it is: "sign changes when going from int to float and back".

This is probably a case of integer overflow. The values probably exceed the maximum size of the integer type. — Abdou, May 11 '17 at 15:53
Makes sense. But why would it only happen on my local machine then? — Manuel Niederl, May 11 '17 at 16:01
Compare your machine's `maxsize` with each value in that dataframe: `import sys; dfprod.SKU.apply(lambda x: sys.maxsize < float(x)).any()`. Maybe that `maxsize` is less than some of the values? — Abdou, May 11 '17 at 16:05
I just checked and unfortunately that's not the solution. The maxsize is: 9223372036854775807 — Manuel Niederl, May 11 '17 at 16:33
I am unable to reproduce this behavior on my machine. Are you by any chance using an old pandas version? In any case, `dfprod.astype(float).astype(np.uint64)` runs just fine on my machine with the use of numpy integer types. — Abdou, May 11 '17 at 17:10
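
One way to test the overflow idea from the comments is sketched below. It rests on an assumption the thread does not confirm: that on the affected machine plain `int` resolves to a 32-bit NumPy integer (this can happen on some platforms even when `sys.maxsize` is 64-bit, which would explain why the `maxsize` check found nothing). The names `int32_max`, `exceeds_int32` and `roundtrip` are illustrative only. The sketch prints the platform's default integer dtype, flags the SKUs that would not fit in 32 bits, and does the round trip with an explicit 64-bit type.

```python
import numpy as np
import pandas as pd

products = {'SKU': [1111000120, 1111000160, 1111000182, 1111000210, 1111001300, 2412601027,
                    2412601449, 5172100236, 5172100370, 5172100713, 7130104717]}
dfprod = pd.DataFrame.from_dict(products)

# What plain `int` maps to on this machine (platform dependent).
print("default int dtype:", np.dtype(int))

# Largest value a signed 32-bit integer can hold: 2147483647.
int32_max = np.iinfo(np.int32).max

# SKUs above that limit cannot survive a cast to a 32-bit integer.
exceeds_int32 = dfprod['SKU'] > int32_max
print(dfprod[exceeds_int32])

# Naming the 64-bit types explicitly avoids relying on the platform default.
roundtrip = dfprod['SKU'].astype(np.float64).astype(np.int64)
print((roundtrip == dfprod['SKU']).all())
```

If the first line prints `int32` on the local machine and `int64` in the online notebook, that difference alone could account for the behavior; in that case `astype(np.int64)` instead of `astype(int)` is a reasonable thing to try.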


0 Answers