11

I'd like to understand what is causing the warning messages that I'm getting in the following scenario:

In an earlier operation I created some NetCDF files and saved them to disk using xarray.to_netcdf().

Lazy evaluation of these datasets is perfectly fine in a Jupyter notebook, and I receive no warnings/errors when doing the following (a minimal sketch appears after the list):

  • opening these .nc files via ds = xarray.open_mfdataset('/path/to/files/*.nc')
  • loading dimension data into memory via ds.time.values
  • lazy selection via ds.sel(time=starttime)
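A minimal sketch of that warning-free workflow (the glob path is illustrative, and I'm picking the first timestamp just for the example):

import xarray as xr

# Lazily open the previously written files; no data is read into memory yet
ds = xr.open_mfdataset('/path/to/files/*.nc')

# Both of these complete without any HDF5 diagnostics
times = ds.time.values          # load dimension data into memory
subset = ds.sel(time=times[0])  # lazy selection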

I seem to be able to do everything I want in making calculations on memory-loaded data. However, I often receive the same set of warnings when doing the following (again sketched after the list):

  • loading data to plot via ds.sel(time=starttime).SCALAR_DATA.plot()
  • extracting/loading data via ts = pd.Series(ds.SCALAR_DATA.loc[:,y,x], index=other_data.index)
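For concreteness, these are the two operations as they appear in my code (SCALAR_DATA, starttime, y, x, and other_data are placeholders from my own workflow):

import pandas as pd

# Either of these forces data to actually be read from disk,
# and that is when the diagnostics appear
ds.sel(time=starttime).SCALAR_DATA.plot()
ts = pd.Series(ds.SCALAR_DATA.loc[:, y, x], index=other_data.index)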

Note that, despite these warnings, the operations I perform do produce the desired outcomes (plots, timeseries structures, etc.).

The common denominator in generating the following messages seems to be loading data from the opened dataset. EDIT: After some further experimentation, it seems that the package versions in my working environment may be causing conflicts among the packages that depend on HDF5.

The following errors repeat some number of times.

HDF5-DIAG: Error detected in HDF5 (1.12.2) thread 1:
  #000: H5A.c line 528 in H5Aopen_by_name(): can't open attribute
    major: Attribute
    minor: Can't open object
  #001: H5VLcallback.c line 1091 in H5VL_attr_open(): attribute open failed
    major: Virtual Object Layer
    minor: Can't open object
  #002: H5VLcallback.c line 1058 in H5VL__attr_open(): attribute open failed
    major: Virtual Object Layer
    minor: Can't open object
  #003: H5VLnative_attr.c line 130 in H5VL__native_attr_open(): can't open attribute
    major: Attribute
    minor: Can't open object
  #004: H5Aint.c line 545 in H5A__open_by_name(): unable to load attribute info from object header
    major: Attribute
    minor: Unable to initialize object
  #005: H5Oattribute.c line 494 in H5O__attr_open_by_name(): can't locate attribute: '_QuantizeBitGroomNumberOfSignificantDigits'
    major: Attribute
    minor: Object not found

...

HDF5-DIAG: Error detected in HDF5 (1.12.2) thread 2:
  #000: H5A.c line 528 in H5Aopen_by_name(): can't open attribute
    major: Attribute
    minor: Can't open object
  #001: H5VLcallback.c line 1091 in H5VL_attr_open(): attribute open failed
    major: Virtual Object Layer
    minor: Can't open object
  #002: H5VLcallback.c line 1058 in H5VL__attr_open(): attribute open failed
    major: Virtual Object Layer
    minor: Can't open object
  #003: H5VLnative_attr.c line 130 in H5VL__native_attr_open(): can't open attribute
    major: Attribute
    minor: Can't open object
  #004: H5Aint.c line 545 in H5A__open_by_name(): unable to load attribute info from object header
    major: Attribute
    minor: Unable to initialize object
  #005: H5Oattribute.c line 476 in H5O__attr_open_by_name(): can't open attribute
    major: Attribute
    minor: Can't open object
  #006: H5Adense.c line 394 in H5A__dense_open(): can't locate attribute in name index
    major: Attribute
    minor: Object not found

Any suggestions on what might be causing these would be greatly appreciated.

jpolly
  • just to be clear these are warnings, not exceptions? do you have a logger enabled or anything like that or are these just spitting out at you unprompted? I think we'll need a full [mre] here unfortunately - at least - I've never seen these before. any chance this is reproducible with a small code-generated dataset? – Michael Delgado Jun 30 '22 at 22:06
  • Good question, these may be exceptions, but they are definitely appearing unprompted, with no loggers or other requests being made. While I understand the value of a minimal reproducible example, I'm beginning to think that there may be some package version compatibility issues among HDF5 and its dependents within my working environment. – jpolly Jul 01 '22 at 13:00
  • The warnings shown have gone away when letting conda solve all the dependencies of packages within my environment. Previously I was manually pip installing most of the packages (xarray, netcdf4, rioxarray, etc.) in my environment. This approach resulted in the errors described above. I don't know if this constitutes an "answer" to the question, but conda installing these packages has fixed the issue, resulting in no warnings. – jpolly Jul 01 '22 at 19:37
  • Yeah that’s what I would have suggested. Note that installing them all at once means they were preferentially selected from compatible channels as well as versions, so conda is ensuring you have consistent compiler flags and versions across packages. – Michael Delgado Jul 01 '22 at 21:20
  • Any update on this, folks? I have the same issue; the code works fine but lots of these messages. I have individually installed all the geo package libraries (`C, C++, NC, HDF4, HDF5`...) on `CentOS 7.9` and `Python 3.9` through `pip`. Thanks – PDash Aug 08 '22 at 07:54
  • Letting conda solve the dependencies between the various packages really ended up being the solution that got rid of these warnings for me. When I'd manually installed all the various packages on top of one another, without carefully specifying versions, the warnings persisted. – jpolly Aug 09 '22 at 20:29
  • 1
    It looks like you can see this issue even when using conda if you have conflicting channels in your environment, or in your base environment, such as if you installed using `anaconda`. See this question (also linked in jpolly’s answer): [HDF5 error when opening NC files in python with xarray](https://stackoverflow.com/a/74248405/2258298) – Michael Delgado Nov 03 '22 at 15:29

3 Answers

6

These warnings could be caused by netcdf4 version 1.6.X.

Downgrading to netcdf4=1.5.8 fixed the issue in my case.
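For example, the pin might look like one of the following, depending on your package manager (the conda-forge channel here is just a common choice, not a requirement):

# with pip
pip install "netCDF4==1.5.8"

# or with conda
conda install -c conda-forge netcdf4=1.5.8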

See also https://github.com/SciTools/iris/issues/5187

ndou
5

I was struggling with very similar errors for the past few days and eventually discovered that restricting my dask client to one thread per worker solved the problem, i.e.:

import os

import xarray as xr
from dask.distributed import Client

# One thread per worker, so no two threads in a worker hit HDF5 at once
c = Client(n_workers=os.cpu_count() - 2, threads_per_worker=1)

ds = xr.open_mfdataset('/path/to/files/*.nc')
ds.sel(...)

Worth a shot if jpolly's solution doesn't work for you (in my case, I'm not using conda...).
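If you don't need a distributed client at all, forcing dask's synchronous scheduler should have a similar effect, assuming the root cause is concurrent HDF5 access (a sketch, not something I've verified):

import dask

# Run every dask task serially in the calling thread, so no two threads
# ever touch the HDF5 library at the same time
dask.config.set(scheduler='synchronous')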

chris
2

Letting conda solve the dependencies between the various packages really ended up being the solution that got rid of these warnings for me.

When I'd manually installed all the various packages on top of one another, without carefully specifying versions or letting conda solve the dependencies, the warnings persisted.
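For what it's worth, a fresh environment solved in one pass might look like this (the environment name and package list are only an illustration; substitute whatever you actually use):

conda create -n geoenv -c conda-forge xarray netcdf4 dask rioxarray matplotlib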

EDIT: There is a nice explanation of this in this answer: https://stackoverflow.com/a/74248405/2258298

jpolly