values in the array change when turned to numpy array

Question

I have data stored in a pandas DataFrame that I convert to a NumPy array using the following code:

# used to be train_X = np.array(train_df.iloc[1:,3:].values.tolist())
# but was split so I could find the source of the change
pylist = train_df.iloc[1:,3:].values.tolist()
print(pylist[0])
train_X = np.array(pylist)
print(train_X[0])

The first print returns:

[0.0, 0.0, 0.0, 0.0, 1.0, 504.0, 0.0, 2.0, 8.0, 0.0, 0.0, 0.0, 0.0, 2.0, 8.0, 0.0, 189.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 85143.0, 57219.0, 62511.267857142804, 2649.26669430866]

The second print, after I convert it to a NumPy array, returns this:

[0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00
 1.00000000e+00 5.04000000e+02 0.00000000e+00 2.00000000e+00
 8.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00
 0.00000000e+00 2.00000000e+00 8.00000000e+00 0.00000000e+00
 1.89000000e+02 0.00000000e+00 0.00000000e+00 0.00000000e+00
 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00
 0.00000000e+00 0.00000000e+00 8.51430000e+04 5.72190000e+04
 6.25112679e+04 2.64926669e+03]

Why does this happen, and how do I stop it?

See https://stackoverflow.com/questions/2891790/how-to-pretty-print-a-numpy-array-without-scientific-notation-and-with-given-pre — Dany Yatim, Jan 06 '20 at 14:48
The data is the same, don't worry. NumPy just switches the representation of the whole array to exponential notation when some values in it are over a certain threshold. — Seb, Jan 06 '20 at 14:50

score 1 · Accepted Answer · answered Jan 06 '20 at 14:53

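As mentioned in the comments, NumPy only changes how the data is displayed: it prints the array in exponential (scientific) notation, but the stored values are the same. To change the way arrays are printed, set the print options before printing:

import numpy as np

# precision=2 rounds the display to two decimals; suppress=True prints
# fixed-point numbers instead of scientific notation
np.set_printoptions(precision=2, suppress=True)
pylist = train_df.iloc[1:,3:].values.tolist()
print(pylist[0])
train_X = np.array(pylist)
print(train_X[0])  # display only; the values in train_X are unchanged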
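
To check that only the display differs, a small self-contained sketch may help. It uses the first row printed in the question; the names `row`, `arr` and `values_unchanged` are illustrative and not part of the original code.

```python
import numpy as np

# First row from the question, as a plain Python list.
row = [0.0, 0.0, 0.0, 0.0, 1.0, 504.0, 0.0, 2.0, 8.0, 0.0,
       0.0, 0.0, 0.0, 2.0, 8.0, 0.0, 189.0, 0.0, 0.0, 0.0,
       0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 85143.0, 57219.0,
       62511.267857142804, 2649.26669430866]

arr = np.array(row)

# Converting back to a list gives exactly the original floats,
# so nothing was lost or altered by the conversion.
values_unchanged = arr.tolist() == row
print(values_unchanged)

# A single element is printed with its full precision.
print(arr[28])
```

If the comparison prints True, the array holds the same numbers as the list, and the exponential form is purely a printing choice.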

velociraptor11

Thanks, I tried showing them without print, and you're right – Mai, Jan 06 '20 at 16:11
I did, but since I don't post a lot it won't display it XD – Mai, Jan 06 '20 at 17:04

Vijayant · Answer 2 · score 0 · 2020-01-06T14:57:33.427

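This happens because NumPy prints arrays differently from pandas or a plain Python list: when the values in an array span a wide range, it switches the whole array to scientific notation. The stored values are the same. You can change the display with np.set_printoptions, for example np.set_printoptions(precision=2, suppress=True).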
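
If changing the print settings globally is not wanted, one possible alternative is the `np.printoptions` context manager, which is available in recent NumPy versions and applies the settings only inside a `with` block. The sketch below uses a short made-up array with values like those in the question; `arr`, `fixed_text` and `default_text` are illustrative names.

```python
import numpy as np

# Values spanning a wide range, similar to the row in the question.
arr = np.array([0.0, 1.0, 504.0, 85143.0, 62511.267857142804, 2649.26669430866])

# Inside the block, arrays are shown in fixed-point form with two decimals.
with np.printoptions(precision=2, suppress=True):
    fixed_text = str(arr)
    print(fixed_text)

# Outside the block the default display settings apply again.
default_text = str(arr)
print(default_text)
```

This keeps the default display everywhere else in the program, which can be useful when only one debug print needs the fixed-point form.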

edited Jan 06 '20 at 14:57

answered Jan 06 '20 at 14:49
