8

I'm trying to convert a netCDF file to either a CSV or text file using Python. I have read this post but I am still missing a step (I'm new to Python). It's a dataset including latitude, longitude, time and precipitation data.

This is my code so far:

import netCDF4
import pandas as pd

precip_nc_file = 'file_path'
nc = netCDF4.Dataset(precip_nc_file, mode='r')

nc.variables.keys()

lat = nc.variables['lat'][:]
lon = nc.variables['lon'][:]
time_var = nc.variables['time']
dtime = netCDF4.num2date(time_var[:],time_var.units)
precip = nc.variables['precip'][:]

I am not sure how to proceed from here, though I understand it's a matter of creating a dataframe with pandas.

aliki43

4 Answers

16

I think pandas.Series should work for creating a CSV with time, lat, lon and precip.

import netCDF4
import pandas as pd

precip_nc_file = 'file_path'
nc = netCDF4.Dataset(precip_nc_file, mode='r')

nc.variables.keys()

lat = nc.variables['lat'][:]
lon = nc.variables['lon'][:]
time_var = nc.variables['time']
dtime = netCDF4.num2date(time_var[:],time_var.units)
precip = nc.variables['precip'][:]

# a pandas.Series designed for time series of a 2D lat,lon grid
precip_ts = pd.Series(precip, index=dtime) 

precip_ts.to_csv('precip.csv',index=True, header=True)
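
A comment on another answer reports that pd.Series raises "Data must be 1-dimensional" when precip is a gridded array. One possible workaround, sketched here on synthetic data, is to flatten the array and index it by every (time, lat, lon) combination. The array shapes, the (time, lat, lon) dimension order and the name precip_long are assumptions for illustration, not details taken from the asker's file.

```python
import numpy as np
import pandas as pd

# Synthetic stand-ins for the arrays read from the netCDF file (assumed shapes).
dtime = pd.date_range("2017-01-01", periods=3, freq="D")      # time axis
lat = np.array([10.0, 20.0])                                   # latitude axis
lon = np.array([100.0, 110.0, 120.0, 130.0])                   # longitude axis
precip = np.arange(3 * 2 * 4, dtype=float).reshape(3, 2, 4)    # (time, lat, lon)

# One index entry per (time, lat, lon) combination, generated in the same
# row-major order that ravel() uses to flatten the array.
index = pd.MultiIndex.from_product([dtime, lat, lon], names=["time", "lat", "lon"])

# np.ma.filled replaces masked (missing) cells with NaN and returns
# plain ndarrays unchanged; it assumes a floating-point array.
values = np.ma.filled(precip, np.nan).ravel()
precip_long = pd.Series(values, index=index, name="precip")

# With no path, to_csv returns the CSV as a string; pass a file name
# such as 'precip.csv' to write it to disk instead.
csv_text = precip_long.to_csv(header=True)
```

Each row of the result holds one time, lat, lon and precip value, which is the long layout most CSV consumers expect. If the variable in your file uses a different dimension order, the list passed to from_product has to follow that order.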
Eric Bridger
7
import xarray as xr

nc = xr.open_dataset('file_path')
nc.precip.to_dataframe().to_csv('precip.csv')
Robert Davy
  • Can you provide more clarity of why you wrote this code compared to the question being asked? How does this code give an answer to the question? – denis_lor Oct 11 '19 at 09:03
  • The popular answer didn't work for me. I am not sure why (maybe I did something wrong?). The xarray library seems to provide a solution in fewer lines of code. This alternative may save time for some people, as it did for me. – Robert Davy Oct 11 '19 at 19:21
  • Regarding the other answer, which didn't work: I tried it on a standard NOAA mslp netCDF file, https://www.esrl.noaa.gov/psd/thredds/fileServer/Datasets/ncep.reanalysis2/surface/mslp.2018.nc, and got the following error at the second-to-last line: – Robert Davy Oct 12 '19 at 06:07
  • >>> # a pandas.Series designed for time series of a 2D lat,lon grid
    ... precip_ts = pd.Series(precip, index=dtime)
    Traceback (most recent call last):
      File "<stdin>", line 2, in <module>
      File "C:\Users\dav500\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\series.py", line 262, in __init__
        raise_cast_failure=True)
      File "C:\Users\dav500\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\internals\construction.py", line 658, in sanitize_array
        raise Exception('Data must be 1-dimensional')
    Exception: Data must be 1-dimensional
    – Robert Davy Oct 12 '19 at 06:10
  • Thank you so much this worked for me so perfectly! I have struggled for days but you saved me! – Jane Kathambi Jul 28 '21 at 14:54
2

Depending on your requirements, you may be able to use NumPy's savetxt function:

import numpy as np

np.savetxt('lat.csv', lat, delimiter=',')
np.savetxt('lon.csv', lon, delimiter=',')
np.savetxt('precip.csv', precip, delimiter=',')

This will output the data without any headings or index column, however.

If you do need those features, you can construct a DataFrame and save it as CSV as follows:

df_lat = pd.DataFrame(data=lat, index=dtime)
df_lat.to_csv('lat.csv')

# and the same for `lon` and `precip`.

Note: here, I assume that the date/time index runs along the first dimension of the data.
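
np.savetxt and a plain DataFrame both expect at most two dimensions, so a gridded (time, lat, lon) variable needs reshaping first. A minimal sketch of one way to do that on synthetic data follows: keep time along the rows and give each grid cell its own column. The shapes, the dimension order and the "lat_lon" column naming are assumptions for illustration.

```python
import numpy as np
import pandas as pd

# Synthetic stand-ins for the arrays read from the netCDF file (assumed shapes).
dtime = pd.date_range("2017-01-01", periods=3, freq="D")
lat = np.array([10.0, 20.0])
lon = np.array([100.0, 110.0])
precip = np.arange(3 * 2 * 2, dtype=float).reshape(3, 2, 2)   # (time, lat, lon)

# Collapse the two spatial axes so that each row is one time step.
precip_2d = precip.reshape(len(dtime), -1)

# Label each column with the grid cell it came from, e.g. "10.0_100.0".
# The nesting order (lat outer, lon inner) matches the reshape above.
columns = [f"{float(la)}_{float(lo)}" for la in lat for lo in lon]

df_precip = pd.DataFrame(precip_2d, index=dtime, columns=columns)
df_precip.index.name = "time"

# With no path, to_csv returns the CSV as a string; pass a file name
# such as 'precip.csv' to write it to disk instead.
csv_text = df_precip.to_csv()
```

This wide layout suits small grids. For large grids the number of columns grows quickly, and one row per (time, lat, lon) value is usually easier to work with.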

Mac
  • Thanks! Unfortunately this didn't work - I decided to just extract all the latitudes and longitudes I was using in my other dataset, and looped over that to get the time series of each place. Like in the link I provided above. Time consuming, but it works! – aliki43 Jun 05 '17 at 14:28
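
The per-location approach described in the comment above could look roughly like the following sketch on synthetic data. The site names and coordinates, the nearest_index helper and the nearest-grid-cell matching are all hypothetical; the comment does not say how locations were matched to the grid.

```python
import numpy as np
import pandas as pd

# Synthetic stand-ins for the arrays read from the netCDF file (assumed shapes).
dtime = pd.date_range("2017-01-01", periods=3, freq="D")
lat = np.array([10.0, 20.0])
lon = np.array([100.0, 110.0, 120.0])
precip = np.arange(3 * 2 * 3, dtype=float).reshape(3, 2, 3)   # (time, lat, lon)


def nearest_index(axis_values, target):
    """Return the position of the axis value closest to target."""
    return int(np.abs(np.asarray(axis_values) - target).argmin())


# Hypothetical locations of interest as (latitude, longitude) pairs.
sites = {"site_a": (12.0, 108.0), "site_b": (19.0, 101.0)}

series = {}
for name, (site_lat, site_lon) in sites.items():
    i = nearest_index(lat, site_lat)
    j = nearest_index(lon, site_lon)
    # All time steps for the grid cell nearest to this site.
    series[name] = precip[:, i, j]

# One row per time step, one column per site.
df_sites = pd.DataFrame(series, index=dtime)
df_sites.index.name = "time"

# With no path, to_csv returns the CSV as a string; pass a file name
# to write it to disk instead.
csv_text = df_sites.to_csv()
```

The helper compares raw coordinate values, so it assumes the site longitudes use the same convention as the file (for example 0 to 360 rather than -180 to 180).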
0

An alternative to the xarray library:

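import netCDF4
import pandas as pd

precip_nc_file = r'file_path\file_name.nc'
nc = netCDF4.Dataset(precip_nc_file, mode='r')

# Read every variable in the file into a plain Python list
cols = list(nc.variables.keys())
list_nc = []
for c in cols:
    list_nc.append(list(nc.variables[c][:]))

# Each variable is one row at this point; transpose so that each becomes a column
df_nc = pd.DataFrame(list_nc)
df_nc = df_nc.T
df_nc.columns = cols
df_nc.to_csv("file_path.csv", index=False)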
Manav Patadia