
I am trying to create a pd.DataFrame, but I am having trouble getting the data type right. I have two NumPy arrays of dtype float.

They were created from a list of coordinates (x & y) as seen here:

# Take coordinates from list and convert to a numpy array
x_vector = np.asarray(x_list, dtype=float)
y_vector = np.asarray(y_list, dtype=float)

For reference here is a sample of what x_vector looks like:

[-2248925.48185815 -2248925.48185815 -2248080.13621823 -2262432.04991849
-2250570.32692157 -2237312.76315587 -2237312.76315587 -2245650.16260083
-2245650.16260083 -2249323.93572129 -2247050.83128422 -2253151.83634956]

I am pleased with the formatting here. The issue arises when I try to add x_vector and y_vector to the pandas DataFrame.

My logic is as follows: I have 201 lat/lon records, so my index has 201 entries. I then add columns corresponding to my data, and finally I set the dtype to match my coordinates (float).

Here is my code:

df = pd.DataFrame(index=range(1, 202, 1), columns=['lat', 'lon', 'ws_daily_max'], dtype=float)

df['lat'] = y_vector
df['lon'] = x_vector

However, when I print df to the console, the values look as if the decimal place has shifted significantly. What went wrong, and why did the lat/lon values change? I was expecting them to look the same as the float values above, e.g. -2248925.48185815.

index           lat           lon  ws_daily_max
1      1.895464e+06 -2.248925e+06           NaN
2      1.895464e+06 -2.248925e+06           NaN

I am truly confused as to what happened. No error message printed, but this is not the result I was expecting. Any clarity as to why and how to fix this would be greatly appreciated.

Help me, StackExchange. You're my only hope.

Nikolai

1 Answer


This is scientific notation for the same number: 1.895464e+06 means 1.895464 * 10^6 = 1895464. So the decimal place did not shift; only the representation changed. The printed value is also rounded for display, while the stored float keeps its full precision. If you want to change how the numbers are displayed, look at http://pandas.pydata.org/pandas-docs/stable/options.html. I hope this helps.
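To convince yourself that nothing was altered, you can compare the value stored in the DataFrame with the value in the original array. The following is a minimal sketch, assuming a three-row frame built the same way as in the question; the sample values are copied from the x_vector printout, and the shortened index is only for illustration.

```python
import numpy as np
import pandas as pd

# Sample values copied from the question's x_vector printout (first few only).
x_vector = np.asarray(
    [-2248925.48185815, -2248080.13621823, -2262432.04991849], dtype=float
)

# Same construction as in the question, shortened to three rows for illustration.
df = pd.DataFrame(index=range(1, 4), columns=['lon'], dtype=float)
df['lon'] = x_vector

# The stored value is exactly the value from the source array.
stored = df['lon'].iloc[0]
print(stored == x_vector[0])

# The column is still a float column; only the printed form differs.
print(df['lon'].dtype)

# Scientific notation is just another way of writing a float literal.
print(float('-2.248925e+06'))
```

If the equality check prints True, the DataFrame holds the full-precision float and the difference is purely in how pandas prints it.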

timm
  • Good answer. Note that the accepted answer to [this question](http://stackoverflow.com/questions/21137150/format-suppress-scientific-notation-from-python-pandas-aggregation-results) shows exactly how to do so. – Ami Tavory Sep 02 '16 at 18:43
  • @AmiTavory interesting, although I agree with the accepted answer on that thread, converting to string for aesthetic purposes is not a best practice. – Nikolai Sep 02 '16 at 20:00
  • @Nikolai I believe the person answering there agreed with you completely on that. Note the first part of the answer, though - `pd.set_option('display.float_format', lambda x: '%.3f' % x)` - that just sets the display option. – Ami Tavory Sep 02 '16 at 20:03
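As a sketch of the display-only approach mentioned in the comments: the snippet below applies the same `'%.3f'` formatter through `pd.option_context`, which is a swapped-in variant of `pd.set_option` that limits the change to a `with` block. The two-row frame and its values are assumptions taken from the question's sample output, not the asker's full data.

```python
import pandas as pd

# Illustrative frame using the longitude value shown in the question.
df = pd.DataFrame({'lon': [-2248925.48185815, -2248925.48185815]}, index=[1, 2])

# Apply the float formatter only inside this block; the data is not modified.
with pd.option_context('display.float_format', lambda x: '%.3f' % x):
    shown = df.to_string()
    print(shown)

# The column is still numeric, so no string conversion has taken place.
print(df['lon'].dtype)
```

Using `pd.set_option('display.float_format', ...)` instead makes the same formatting apply for the rest of the session, and `pd.reset_option('display.float_format')` restores the default.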