I am trying to create a pd.DataFrame, but I am having trouble getting the data type correct. I have two numpy arrays of type float.
They were created from a list of coordinates (x & y) as seen here:
import numpy as np

# Take coordinates from the lists and convert them to numpy arrays
x_vector = np.asarray(x_list, dtype=float)
y_vector = np.asarray(y_list, dtype=float)
For reference, here is a sample of what x_vector looks like:
[-2248925.48185815 -2248925.48185815 -2248080.13621823 -2262432.04991849
-2250570.32692157 -2237312.76315587 -2237312.76315587 -2245650.16260083
-2245650.16260083 -2249323.93572129 -2247050.83128422 -2253151.83634956]
I am pleased with the formatting here; the issue arises when I try to add x_vector and y_vector to the pandas DataFrame.
My logic is that I have 201 records of lat/lons, so my index has 201 entries. I then add columns corresponding to my data, and finally I set the dtype to match my coordinates (float).
Here is my code:
import pandas as pd

# Empty float frame with a 1-based index covering the 201 records
df = pd.DataFrame(index=range(1, 202), columns=['lat', 'lon', 'ws_daily_max'], dtype=float)
df['lat'] = y_vector
df['lon'] = x_vector
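For anyone who wants to reproduce this, here is a minimal, self-contained sketch of the same construction. It is sized to four rows instead of 201. The x values are copied from the sample above; the y values are made up for illustration, since I only have them in rounded form here.

```python
import numpy as np
import pandas as pd

# x values copied from the sample above; y values are hypothetical placeholders.
x_list = [-2248925.48185815, -2248925.48185815, -2248080.13621823, -2262432.04991849]
y_list = [1895464.25, 1895464.25, 1896001.5, 1893210.75]

x_vector = np.asarray(x_list, dtype=float)
y_vector = np.asarray(y_list, dtype=float)

# Same construction as in the question, with a 1-based index sized to the sample.
n = len(x_vector)
df = pd.DataFrame(index=range(1, n + 1), columns=['lat', 'lon', 'ws_daily_max'], dtype=float)
df['lat'] = y_vector
df['lon'] = x_vector

print(df)
```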
However, when I print df to the console, the values appear with the decimal point shifted significantly. What went wrong? Why did the lat/lon values change? I was expecting them to match the float values above (e.g. -2248925.48185815).
index           lat           lon  ws_daily_max
1      1.895464e+06 -2.248925e+06           NaN
2      1.895464e+06 -2.248925e+06           NaN
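One check that might help narrow this down is to compare the stored column against the original array, and to print the frame with a fixed-point float format, to see whether the stored numbers differ or only the way they are printed. This is just a sketch on two of the sample x values, and the one-column frame is a simplified stand-in for my real one:

```python
import numpy as np
import pandas as pd

x_vector = np.asarray([-2248925.48185815, -2248080.13621823], dtype=float)
df = pd.DataFrame({'lon': x_vector}, index=range(1, 3))

# True would mean the stored values are identical to the input array.
same = np.array_equal(df['lon'].to_numpy(), x_vector)
print(same)

# Temporarily render floats in fixed-point notation with 8 decimal places.
with pd.option_context('display.float_format', '{:.8f}'.format):
    shown = df.to_string()
print(shown)
```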
I am truly confused as to what happened. No error message printed, but this is not the result I was expecting. Any clarity as to why and how to fix this would be greatly appreciated.
Help me, StackExchange. You're my only hope.