4

I have a NetCDF file in which the variables are stored on a 0 to 360 degree longitude grid. I would like to convert it to -180 to 180 degrees. This should be a fairly straightforward task, but for some reason I can't get some of the examples given in the tutorial to work.

import xarray as xr

ds = xr.open_dataset(file_)
>ds
<xarray.Dataset>
Dimensions:  (lev: 1, lon: 720, time: 1460)
Coordinates:
* lon      (lon) float64 0.0 0.5 1.0 1.5 2.0 2.5 ... -2.5 -2.0 -1.5 -1.0 -0.5
* lev      (lev) float32 1.0
* time     (time) datetime64[ns] 2001-01-01 ... 2001-12-31T18:00:00
Data variables:
 V        (time, lev, lon) float32 13.281297 11.417505 ... -19.312767

I tried using assign_coords:

ds.V.assign_coords(lon=((ds.V.lon + 180) % 360 - 180)) 
#gives me a new array with lon -180 to 180
ds['V'] = ds.V.assign_coords(lon=((ds.V.lon + 180) % 360 - 180))
# didn't modify the V for some reason?

So assign_coords worked, but assigning the result back to the Dataset did not. After many tries, I found that I could modify the "lon" coordinate directly, because it is linked to the data variable "V" via a dictionary.

ds.coords['lon'] = (ds.coords['lon'] + 180) % 360 - 180
#solves the problem!

The second problem I encountered is sorting my data variable by the modified longitudes. I tried:

 ds['V'] = ds.V.sortby(ds.lon)
 >ds.V 

 # the array is not sorted according to -180 to 180 values

But when I sort the dataset and assign it, it works.

ds = ds.sortby(ds.lon) # now my dataset is sorted to -180 to 180 degrees lon

It would be very helpful for my understanding of xarray if someone could point out why my first approach to each problem does not work.

Light_B

4 Answers

12

I apologise for the one-liner, but this is exactly how I solved this issue:

d = d.assign_coords(longitude=(((d.longitude + 180) % 360) - 180)).sortby('longitude')

You should work at the Dataset level and not at the DataArray level.
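
As a minimal sketch of that one-liner on synthetic data (the variable `V`, the `lat`/`lon` names, the grid, and the random values are stand-ins for a real file, not taken from it), the following shows that the other coordinates survive and that the data move together with their longitudes:

```python
import numpy as np
import xarray as xr

# Synthetic stand-in for the NetCDF file: 0 to 359.5 degrees in 0.5-degree steps.
lon = np.arange(0.0, 360.0, 0.5)
lat = np.array([-10.0, 0.0, 10.0])
rng = np.random.default_rng(0)
ds = xr.Dataset(
    {"V": (("lat", "lon"), rng.random((lat.size, lon.size)))},
    coords={"lat": lat, "lon": lon},
)

# Wrap longitudes into [-180, 180), then sort so they increase monotonically.
# assign_coords returns a new Dataset; ds itself is left untouched.
wrapped = ds.assign_coords(lon=((ds.lon + 180) % 360) - 180).sortby("lon")

print(wrapped.lon.values[:3], wrapped.lon.values[-3:])
print(wrapped.lat.values)
```

Because the relabelling and the sort both happen on the Dataset, every data variable that depends on `lon` is reordered consistently in one step.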

Matteo De Felice
  • Thanks for the suggestion. If there were other coordinates in my data, like latitude or pressure levels, would dataset.assign_coords delete them automatically and just assign the longitudes as the new coordinates? – Light_B Nov 05 '18 at 10:41
  • In the xarray documentation for assign_coords you can read: "Returns a new object with all the original data in addition to the new coordinates." So the other coordinates will not change; in this specific case you just "overwrite" longitude with a modified version... – Matteo De Felice Nov 05 '18 at 13:21
3

There's one principle that explains why both of your initial approaches didn't work. In a Dataset, variables have values along coordinates. The coordinates have a separate existence in the Dataset from the variables. You may have three variables U, V, and W which all vary along some coordinate longitude within the dataset. On their own, it's fine for U and V to have their longitude values in different orders, but within the dataset they must have the same ordering.

When you assign a variable to a dataset that already has the coordinate of that variable, xarray automatically re-orders the variable to match the dataset's ordering. It also does nice things like adding NaN values wherever the variable has no values for a given coordinate in the dataset. In your case, ds['V'] = ... re-aligns the modified V to the dataset's existing lon coordinate, so the dataset's longitudes and their order stay as they were.

Here's an example where I've made a Dataset and DataArray that both have a longitude coordinate, but in reversed directions. When I assign the DataArray to the Dataset, the coordinate is automatically reversed.

In [17]: ds
Out[17]: 
<xarray.Dataset>
Dimensions:    (longitude: 10)
Coordinates:
  * longitude  (longitude) float64 360.0 320.0 280.0 240.0 200.0 160.0 120.0 ...
Data variables:
    *empty*

In [18]: da
Out[18]: 
<xarray.DataArray (longitude: 10)>
array([ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.])
Coordinates:
  * longitude  (longitude) float64 0.0 40.0 80.0 120.0 160.0 200.0 240.0 ...

In [19]: ds['v'] = da

In [20]: ds['v']
Out[20]: 
<xarray.DataArray 'v' (longitude: 10)>
array([ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.])
Coordinates:
  * longitude  (longitude) float64 360.0 320.0 280.0 240.0 200.0 160.0 120.0 ...

Here's a similar example where it fills in NaN automatically:

In [27]: ds
Out[27]: 
<xarray.Dataset>
Dimensions:    (longitude: 10)
Coordinates:
  * longitude  (longitude) float64 360.0 320.0 280.0 240.0 200.0 160.0 120.0 ...
Data variables:
    *empty*

In [28]: da
Out[28]: 
<xarray.DataArray (longitude: 3)>
array([ 0.,  0.,  0.])
Coordinates:
  * longitude  (longitude) float64 0.0 40.0 80.0

In [29]: ds['v'] = da

In [30]: ds['v']
Out[30]: 
<xarray.DataArray 'v' (longitude: 10)>
array([ nan,  nan,  nan,  nan,  nan,  nan,  nan,   0.,   0.,   0.])
Coordinates:
  * longitude  (longitude) float64 360.0 320.0 280.0 240.0 200.0 160.0 120.0 ...
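
The same alignment accounts for the sorting problem in the question. Here is a minimal sketch, assuming a toy dataset whose four longitudes have already been wrapped but are still in file order (the names and values are made up for illustration):

```python
import numpy as np
import xarray as xr

# Toy dataset: longitudes already wrapped to [-180, 180) but not yet sorted.
ds = xr.Dataset(
    {"V": ("lon", np.array([1.0, 2.0, 3.0, 4.0]))},
    coords={"lon": np.array([0.0, 90.0, -180.0, -90.0])},
)

# Sorting the DataArray on its own gives ascending longitudes.
sorted_da = ds.V.sortby("lon")
print(sorted_da.lon.values)

# Assigning it back aligns it to the Dataset's existing lon index,
# so the Dataset keeps its original, unsorted order.
ds_try = ds.copy()
ds_try["V"] = sorted_da
print(ds_try.lon.values, ds_try.V.values)

# Sorting the whole Dataset reorders the coordinate and the variable together.
ds_sorted = ds.sortby("lon")
print(ds_sorted.lon.values, ds_sorted.V.values)
```

The design point is that the Dataset owns the index: to change the order or the labels of a coordinate, change it on the Dataset rather than on a single variable.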
Jeremy McGibbon
  • Thanks, I got confused because the documentation said that coordinates are not stored as an ordered dictionary, which led me to believe that each variable is separately linked to the coordinates via a dictionary. Of course, now I see that in a dataset that wouldn't make sense. – Light_B Nov 05 '18 at 10:31
2

It is not a Python solution, but if you are on Linux and have NCO installed, you can type

ncap2 -O -s 'where(lon>180) lon=lon-360' ifile ofile

as per the answer to "How to change longitude range in a NetCDF".

ClimateUnboxed
  • Sure, as I acknowledged at the start of my post, but perhaps the OP (or others searching for a solution to this issue, but may not be wedded to python) doesn't know there is a one liner shell alternative - I don't think it harms to add a short efficient alternative to the list of python answers (which I upvoted) and never understand for this reason why people on SO vote down answers that address the question but not in the language requested. Answers are not just for the OP but for the general community. – ClimateUnboxed Nov 04 '18 at 21:52
  • Sure, it's also good for me to know that it's possible in this way too. I'm assuming doing it in nco would be faster than in python for a larger dataset? – Light_B Nov 05 '18 at 10:33
  • Hi Light_B, thanks for your positive comment. If you want to open the file, change the range, and then do further processing in Python, it's probably faster in Python, since with nco/cdo solutions you are opening, reading, and writing to disk, and then opening and reading again in Python to do your processing. If you want to make the change once and then repeatedly use that file, then it is possibly more efficient to do it this way. I'm not an expert, to be honest. However, the nco/cdo solution is usually more efficient in terms of *your* time, which is often more important ;-) – ClimateUnboxed Nov 05 '18 at 13:51
1

cdo is also a quick option for this kind of problem, for example:

cdo sellonlatbox,-180,180,-90,90 a.nc b.nc

Here a.nc is your input data and b.nc is the result you want.

shen159876