2

I have two netcdf files from WRF runs, one with hourly data and the other smaller file has the coordinates (XLAT and XLONG). I am trying to retrieve a subset of the data based on certain coordinates.

An example of one of the variables is temperature 'T2' which has the dimensions (1,1015,1359) being the (time, south_north, west_east), respectively.

The XLAT and XLONG have the same dimensions (1,1015,1359).

There was an identical question asked (please see netcdf4 extract for subset of lat lon), because my lat/long dimensions are a little different the script did not work for me and I haven't been able to figure out why. I tried to change the coordinates into 1D arrays, so that it would be similar to the previous question, but the script does not work and I get an indexing error.

If someone could please help me out that would be awesome! Thanks in advance :)

import numpy as np
from netCDF4 import Dataset  
import matplotlib.pyplot as plt

lons = b.variables['XLONG'][:]
lats = b.variables['XLAT'][:]

lons2d =lons.reshape((1015,1359))
lons1d = lons2d.reshape((1379385))

lats2d =lats.reshape((1015,1359))
lats1d = lats2d.reshape((1379385))

lat_bnds, lon_bnds = [49,53], [-125,-115]
lat_inds = np.where((lats1d > lat_bnds[0]) & (lats1d < lat_bnds[1]))
lon_inds = np.where((lons1d > lon_bnds[0]) & (lons1d < lon_bnds[1]))

T_subset = a.variables['T2'][:,lat_inds,lon_inds]

However I get the following error:

---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-2-0f8890d3b1c5> in <module>()
 25 lon_inds = np.where((lons1d > lon_bnds[0]) & (lons1d < lon_bnds[1]))
 26 
---> 27 T_subset = a.variables['T2'][:,lat_inds,lon_inds]
 28 
 29 

netCDF4/_netCDF4.pyx in      netCDF4._netCDF4.Variable.__getitem__(netCDF4/_netCDF4.c:35672)()

/Users/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/netCDF4/utils.pyc in _StartCountStride(elem, shape, dimensions, grp, datashape, put)
197         # Raise error if multidimensional indexing is used.
198         if ea.ndim > 1:
--> 199             raise IndexError("Index cannot be multidimensional")
200         # set unlim to True if dimension is unlimited and put==True
201         # (called from __setitem__)
IndexError: Index cannot be multidimensional
Community
  • 1
  • 1
jdiction
  • 141
  • 1
  • 13

2 Answers2

1

I see an obvious problem with lat_inds as it has max shape 1015*1359, but You try to use it as an index for latitude, which has size 1015. So IMO You should first find similar values for lat_inds and lon_inds, points which satisfy both lon and lat limits, and then use this array for flattened data. Something like:

uni_ind=numpy.intersect1d(lat_inds,lon_inds)
T_subset=np.ravel(a.variables['T2'])[uni_ind]

Converting array back to 2D may contain some more issues, because I assume Your original data is not in cylindrical coordinates and thus the resulting subset may not be rectangular. This code is not tested, if You share the original data files, I could do that as well.

EDIT: For correct plotting it is easier to use masking, this example should be informative enough.

import numpy as np
from netCDF4 import Dataset
import matplotlib.pyplot as plt

b = Dataset('wrfout_conus_constants.nc')
a = Dataset('wrf2d_d01_2010-01-11_000000')

## Data coords
xlong = b.variables['XLONG'][0]
xlat = b.variables['XLAT'][0]
## Data var
temp = a.variables['T2'][0]
## Data bounds
longmax, longmin = -115, -125
latmax, latmin = 53, 49
## Mask coordinates according to bounds
latmask=np.ma.masked_where(xlat<latmin,xlat).mask+np.ma.masked_where(xlat>latmax,xlat).mask
lonmask=np.ma.masked_where(xlong<longmin,xlong).mask+np.ma.masked_where(xlong>longmax,xlat).mask
totmask = lonmask + latmask
## Show mask compared to full domain
plt.pcolormesh(totmask)
## Apply mask to data
temp_masked = np.ma.masked_where(totmask,temp)
## plot masked data
fig=plt.figure()
plt.contourf(temp_masked)
## plot full domain
fig=plt.figure()
plt.contourf(temp)
plt.show()
kakk11
  • 898
  • 8
  • 21
  • I have put the constants file (which has the XLAT and XLONG variables in it) as well as an hourly file in the following shareable dropbox folder. Thank you for the help! https://www.dropbox.com/sh/tqockwhj9yrag6k/AABcG7n0X7YU9N6aCgoW2mrWa?dl=0 – jdiction Feb 17 '16 at 21:51
  • Added working code to answer, feel free to ask if anything remains unclear. – kakk11 Feb 19 '16 at 09:19
1

I'm not sure why it's not working, but I think this does what you want and is cleaner:

import numpy as np
from netCDF4 import Dataset
import matplotlib.pyplot as plt

# By indexing at 0 along first dimension, we eliminate the time
# dimension, which only had size 0 anyway.
lons = b.variables['XLONG'][0]
lats = b.variables['XLAT'][0]
temp = a.variables['T2'][0]

lat_bnds, lon_bnds = [49,53], [-125,-115]

# Just AND together all of them and make a big mask
subset = ((lats > lat_bnds[0]) & (lats < lat_bnds[1]) & 
          (lons > lon_bnds[0]) & (lons < lon_bnds[1]))

# Apply mask--should apply to trailing dimensions...I think
T_subset = temp[subset]
DopplerShift
  • 5,472
  • 1
  • 21
  • 20
  • Unfortunately, I still get an error at the line --> Index error: Boolean array must have the same shape as the data along this dimension. T_subset = a.variables['T2'][subset] – jdiction Feb 17 '16 at 21:40
  • I've edited my code above to work with the data you posted. Note that the manipulation of lats/lons is no longer needed and can be used directly. – DopplerShift Feb 18 '16 at 16:54
  • I used the code and it worked to subset the data, but my issue now is that I cannot plot it with plt.contourf, because the array is no longer 2D. Is there a way I can fix that? Cheers. – jdiction Feb 18 '16 at 23:31
  • One option is to use matplotlib's `tricontour` function, which doesn't require a 2D grid (for that you can subset `lons` and `lats` using `subset`. The other option is to pick rows/columns to do a slice of the original data instead of masking. Even putting aside the challenges of using boolean indexing, the problem is that using bounds on true 2D lons and lats will pull out a trapezoid of data, not a rectangle. – DopplerShift Feb 19 '16 at 21:02