I have a netCDF4.Variable object:
<class 'netCDF4._netCDF4.Variable'>
int16 myvar(time, latitude, longitude)
standard_name: my_var
long_name: Something
units: (0 - 1)
add_offset: 0.499999843485
scale_factor: 1.54488841466e-05
_FillValue: -32767
missing_value: -32767
unlimited dimensions: time
current shape = (13148, 1441, 2880)
filling off
The variable is 3D: the first dimension is time and the other two are spatial (latitude and longitude).
I would like to access a subset of this variable containing:
- A subset of the temporal range (e.g. from 7000 to 8000).
- A subset of the points, identified by indices into the flattened version of the spatial grid. In the above example, indices would range between 0 and 1441 * 2880 (exclusive).
Basically, I have:
tmin = 7000
tmax = 8000
upts = [42829, 9289, 3242]
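To make the meaning of the flat indices concrete, here is a minimal sketch of how they map back to (latitude, longitude) index pairs. It assumes the row-major (C) ordering that flatten() uses by default; the names nlat and nlon are introduced only for this illustration.

```python
import numpy as np

nlat, nlon = 1441, 2880            # spatial shape taken from the variable above
upts = [42829, 9289, 3242]         # flat indices into the (nlat, nlon) grid

# flatten() uses C (row-major) order, so flat = lat * nlon + lon.
lat, lon = np.unravel_index(upts, (nlat, nlon))
print(lat.tolist(), lon.tolist())
```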
My current way of accessing this is:
import numpy as np
# one row per point, one column per time step
z = np.zeros((len(upts), tmax - tmin))
for i in range(tmin, tmax):
    z[:, i - tmin] = my_var[i, :, :].flatten()[upts]
I was wondering if there was a faster way to do this?
I cannot load the whole dataset in memory because it is already huge, and could be larger.
I also cannot work with only a single i, because I want to operate on rows of z (which correspond to "columns" in my_var).
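One possible direction, sketched here on a small in-memory NumPy array standing in for my_var, is to read the time range over the bounding box of the requested points in a single slice and then pick the points with NumPy indexing. The array sizes and the names data, block and z_ref are hypothetical. Whether this is actually faster on the real file is an assumption that would depend on the file's chunking and on how spread out the points are; nothing here has been measured.

```python
import numpy as np

# Small in-memory stand-in for my_var (hypothetical sizes, not the real data).
rng = np.random.default_rng(0)
data = rng.random((20, 6, 8))      # (time, latitude, longitude)
tmin, tmax = 5, 15
upts = [42, 9, 3]                  # flat indices into the 6 x 8 spatial grid

# Convert flat indices to (lat, lon) indices and compute their bounding box.
lat, lon = np.unravel_index(upts, data.shape[1:])
lat0, lat1 = lat.min(), lat.max() + 1
lon0, lon1 = lon.min(), lon.max() + 1

# One slice read covering only the time range and the bounding box.
block = data[tmin:tmax, lat0:lat1, lon0:lon1]

# Pick the requested points inside the block, then transpose so the
# result has shape (len(upts), tmax - tmin), like z in the loop version.
z = block[:, lat - lat0, lon - lon0].T

# Reference result from the per-time-step loop, for comparison.
z_ref = np.zeros((len(upts), tmax - tmin))
for i in range(tmin, tmax):
    z_ref[:, i - tmin] = data[i, :, :].flatten()[upts]

print(np.array_equal(z, z_ref))
```

With the real variable, the slice step would presumably read my_var[tmin:tmax, lat0:lat1, lon0:lon1]. If the points are scattered over the whole grid, the bounding box may be too large to hold in memory, in which case the time range could be processed in smaller batches.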