I am trying to convert a NetCDF file to .csv, much like this post. I am using a netCDF file with similar variables: 'time', 'lat', 'lon', 'total'.

I've reproduced the top answer's code:

import netCDF4
import pandas as pd

file = 'file_path'
nc = netCDF4.Dataset(file, mode='r')

nc.variables.keys()

lat = nc.variables['lat'][:]
lon = nc.variables['lon'][:]
time_var = nc.variables['time']
dtime = netCDF4.num2date(time_var[:],time_var.units)
total = nc.variables['total'][:]

total_ts = pd.Series(total, index=dtime) 
total_ts.to_csv('total.csv',index=True, header=True)

However, I am getting a warning and an error:

UserWarning: WARNING: valid_range not used since it cannot be safely cast to variable data type
dtime = netCDF4.num2date(time_var[:],time_var.units)

and

total_ts = pd.Series(total,index=dtime)
Exception: Data must be 1-dimensional

I am not sure what went wrong since the code is exactly the same and the netCDF file is very similar.

plummms
  • Can you give us the output of `ncdump -h file_path`? – msi_gerva Oct 08 '18 at 07:36
  • Hi, the full output is too long for a comment but I can post the important details: dimensions: time = 1 ; lat = 29 ; lon = 18 ; variables: float total(time, lat, lon), double lat(lat), double lon(lon), int time(time) Let me know if you need more info @msi_gerva, thanks! – plummms Oct 13 '18 at 17:24
  • I was most interested in the time variable and the units of time. In any case, it seems strange to me to use an integer type for time. I would expect a double variable here... The second error is also clear now: Pandas expects a 1D array for total_ts, but you are giving it a 3D array with dimensions (time,lat,lon). You could get rid of the second error with `total = total.flatten()`, provided that total is a NumPy array. – msi_gerva Oct 13 '18 at 17:55
  • As requested, along with other details: time:units = "minutes since 2016-01-01 00:30:00" ; time:time_increment = 60000 ; time:begin_date = 20160101 ; time:begin_time = 3000 ; Thanks for explaining! You were right that the exception went away when I flattened it, unfortunately it got replaced with a ValueError: Length of passed values is 522, index implies 1 – plummms Oct 13 '18 at 18:42
  • I guess the last error occurs because, once you flatten your data, the lengths of the time data and the total data no longer match: one has as many values as the time dimension, and the other has the product of the time, lat and lon dimensions. Anyhow, I am not sure what your aim with the data is. To me it does not make sense to convert (time,lat,lon) data to one Pandas table; I would rather work with the NumPy array with (time,lat,lon) dimensions. If you are going to use just one time series, then it makes sense to use a Pandas table for it. – msi_gerva Oct 14 '18 at 17:09
  • I see, thanks for that. My aim was to make a table with the 4 variables as columns- basically for every time, lat, lon, there would also be a value for total. – plummms Oct 15 '18 at 00:48
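
For the table described in the last comment (one row per time, lat, lon combination, with total as the fourth column), one possible approach is to build a pandas MultiIndex from the three coordinate arrays and flatten total against it. This is only a sketch: the arrays below are synthetic stand-ins with the shapes reported in the comments (time = 1, lat = 29, lon = 18), and the coordinate values are made up. With the real file, `times`, `lat`, `lon` and `total` would come from `nc.variables[...]` as in the question.

```python
import numpy as np
import pandas as pd

# Synthetic stand-ins (assumption): same shapes as in the ncdump output,
# but the actual values are invented for illustration.
times = pd.to_datetime(["2016-01-01 00:30:00"])   # one time step
lat = np.linspace(-10.0, 4.0, 29)                 # 29 hypothetical latitudes
lon = np.linspace(100.0, 108.5, 18)               # 18 hypothetical longitudes
total = np.random.default_rng(0).random((1, 29, 18)).astype("float32")

# total.size is 522 (1 * 29 * 18) while len(times) is 1, which is why an
# index built from time alone does not fit the flattened data.

# One index entry per (time, lat, lon) combination. from_product varies the
# last level fastest, matching the C-order flattening of a (time, lat, lon) array.
index = pd.MultiIndex.from_product([times, lat, lon], names=["time", "lat", "lon"])

# If total is a masked array, np.ma.filled(total, np.nan) could be applied
# first so that masked cells become NaN.
df = pd.DataFrame({"total": np.asarray(total).ravel()}, index=index).reset_index()

# Four columns: time, lat, lon, total.
df.to_csv("total.csv", index=False)
```

If xarray is available, `xarray.open_dataset(file).to_dataframe().reset_index()` may give a similar table with less code, though that has not been tried against this particular file.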