
I wish to subset my xarray Dataset via a list of variable names. However, when I do so, the resultant Dataset no longer has the coordinate reference information, as evidenced by adding the subset as a layer in QGIS.

How can I keep the coordinate reference information after subsetting the original Dataset?

import xarray as xr

DS = xr.open_dataset("my_data.nc")
bands = ['CMI_C01','CMI_C02','CMI_C03']

# Test does not have coordinate reference information :(
test = DS[bands]

It is apparent that the coordinate reference information is not stored in the .coords attribute, due to the following not working:

# Test still does not have coordinate reference info
test = test.assign_coords(dict(DS.coords))

# When put into QGIS, does not have the CRS
test.to_netcdf("test.nc")

Where is the CRS stored for xarray Datasets?


For background, I am using GOES imagery from the public AWS s3 bucket.

This is what the original Dataset looks like:

Dimensions:                                 (y: 1500, x: 2500,
                                             number_of_time_bounds: 2,
                                             number_of_image_bounds: 2, band: 1)
Coordinates: (3/37)
* t                                       datetime64[ns] 2017-03-04T08:38:0...
* y                                       (y) float32 0.1265 ... 0.04259
* x                                       (x) float32 -0.07501 ... 0.06493.47
   

Attributes: (2/29)
    naming_authority:          gov.nesdis.noaa
    Conventions:               CF-1.7
Michael Delgado
Sean Carter

1 Answer


Coordinates in xarray refer to the dimension labels and have nothing to do with spatial coordinate reference system (CRS) metadata.

You're looking for xarray Attributes. These can be accessed with .attrs, and you can carry over attributes from one dataset to another with:

test.attrs.update(DS.attrs)

You can carry over attributes within variables in a similar way:

test[varname].attrs.update(DS[varname].attrs)

As an example, after computing a simple operation which does not change the set of data variables or coordinates, you could do the following:

# simple operation, which removes all attributes but does not change
# the dataset's structure
ds = orig_ds * 2

# copy the dataset-level attributes, then the attributes of every
# coordinate and data variable, from the original dataset
ds.attrs.update(orig_ds.attrs)
for c in ds.coords.keys():
    ds[c].attrs.update(orig_ds[c].attrs)
for v in ds.data_vars.keys():
    ds[v].attrs.update(orig_ds[v].attrs)

Note that xarray never handles CRS information explicitly, and by default it does not preserve attributes in computations. You can change this default so that attributes are kept across computation steps with:

xr.set_options(keep_attrs=True)
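
As a minimal sketch of what this option does, the snippet below builds a small synthetic dataset (the variable name, shape, and attribute values are made up for illustration) and applies the option through a context manager, so the setting only holds inside the `with` block. Whether attributes are dropped without the option depends on the xarray version, so only the `keep_attrs=True` result is shown as guaranteed.

```python
import numpy as np
import xarray as xr

# Synthetic stand-in dataset; names and attribute values are illustrative only.
orig_ds = xr.Dataset(
    {"CMI_C01": (("y", "x"), np.ones((2, 3)), {"units": "1"})},
    attrs={"Conventions": "CF-1.7"},
)

# Scope the option to this block instead of changing it globally.
with xr.set_options(keep_attrs=True):
    kept = orig_ds * 2

# Dataset-level and variable-level attributes both survive the arithmetic.
print(kept.attrs)
print(kept["CMI_C01"].attrs)
```

Calling `xr.set_options(keep_attrs=True)` without `with` applies the setting for the rest of the session.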

See the FAQ section "What is your approach to metadata?" for more information. Also see the docs on Data Structures for more detail on the various xarray objects.

Michael Delgado
  • Many thanks for the information. I now see that xarray does not explicitly handle metadata, so selecting bands must also include ways to update the attributes of the selection. However, trying your `test.attrs.update(DS.attrs)` method still did not yield a raster file with the correct coordinate reference information – Sean Carter Jul 29 '22 at 19:02
  • If the attributes are on a variable within the dataset, be sure to update those as well, e.g. `test[variable].attrs.update(source_ds[variable].attrs)`. This does work, so if you have a specific workflow that's not working, feel free to update your question or ask another one. It could be something about the way you're doing it that needs a tweak, but I use this all the time to preserve attributes. You can also use the set_options method. – Michael Delgado Jul 29 '22 at 19:05
  • Many thanks for the information. Indeed, the QGIS-readable geographic information of this particular data product was stored as a variable known as `'goes_imager_projection'`, so including this variable in the `bands` list allowed QGIS to recognize the geographic information associated with this product. – Sean Carter Aug 01 '22 at 19:14
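
The fix described in the last comment can be sketched as follows. This is a hedged illustration on a synthetic dataset, not the actual GOES file: it assumes each band carries a CF-style `grid_mapping` attribute naming the projection variable, and all values and attribute contents below are invented.

```python
import numpy as np
import xarray as xr

# Synthetic stand-in for the GOES file; values and attributes are made up.
band_attrs = {"grid_mapping": "goes_imager_projection"}
DS = xr.Dataset(
    {
        "CMI_C01": (("y", "x"), np.zeros((2, 3)), dict(band_attrs)),
        "CMI_C02": (("y", "x"), np.zeros((2, 3)), dict(band_attrs)),
        # Scalar variable whose attributes hold the projection parameters.
        "goes_imager_projection": ((), 0, {"grid_mapping_name": "geostationary"}),
    }
)

bands = ["CMI_C01", "CMI_C02"]

# Collect the projection variables that the selected bands point to
# (assumes the CF "grid_mapping" attribute is present on each band).
grid_mappings = {
    DS[b].attrs["grid_mapping"] for b in bands if "grid_mapping" in DS[b].attrs
}

# Subset the bands together with their projection variable(s).
test = DS[bands + sorted(grid_mappings)]
print(list(test.data_vars))
```

With the real file, simply appending `'goes_imager_projection'` to `bands` by hand, as the comment describes, achieves the same thing.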