This is my first time using netCDF and I'm trying to wrap my head around working with it.
I have multiple version 3 netcdf files (NOAA NARR air.2m daily averages for an entire year). Each file spans a year between 1979 - 2012. They are 349 x 277 grids with approximately a 32km resolution. Data was downloaded from here.
The dimension is time (hours since 1/1/1800) and my variable of interest is air. I need to calculate accumulated days with a temperature < 0. For example
Day 1 = +4 degrees, accumulated days = 0
Day 2 = -1 degrees, accumulated days = 1
Day 3 = -2 degrees, accumulated days = 2
Day 4 = -4 degrees, accumulated days = 3
Day 5 = +2 degrees, accumulated days = 0
Day 6 = -3 degrees, accumulated days = 1
I need to store this data in a new netcdf file. I am familiar with Python and somewhat with R. What is the best way to loop through each day, check the previous days value, and based on that, output a value to a new netcdf file with the exact same dimension and variable.... or perhaps just add another variable to the original netcdf file with the output I'm looking for.
Is it best to leave all the files separate or combine them? I combined them with ncrcat and it worked fine, but the file is 2.3gb.
Thanks for the input.
My current progress in python:
import numpy
import netCDF4
#Change my working DIR
f = netCDF4.Dataset('air7912.nc', 'r')
for a in f.variables:
print(a)
#output =
lat
long
x
y
Lambert_Conformal
time
time_bnds
air
f.variables['air'][1, 1, 1]
#Output
298.37473
To help me understand this better what type of data structure am I working with? Is ['air'] the key in the above example and [1,1,1] are also keys? to get the value of 298.37473. How can I then loop through [1,1,1]?