
When accessing remote datasets with xarray.open_dataset, warnings or errors sometimes show up in my console.

Capturing messages written to sys.stderr or sys.stdout is certainly possible (alternatively, one can use warnings management, as @MichaelDelgado suggested):

import sys
from io import StringIO
import xarray as xr

url = "http://nomads.ncep.noaa.gov:80/dods/gefs/gefs20220909/gec00_00z_pgrb2a"

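try:
    # Temporarily route Python-level stderr into an in-memory buffer
    sys.stderr = err = StringIO()
    ds = xr.open_dataset(url)
    print(ds.dims)
except OSError as ose:
    # This also lands in the buffer, because sys.stderr is still redirected
    print(f"Error Log = {ose}", file=sys.stderr)
finally:
    # Restore the real stderr, then report whatever was captured
    sys.stderr = sys.__stderr__
    if err.tell() > 0:
        err.seek(0)
        print("Additional Info: " + err.read())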

->

Frozen({'ens': 1, 'time': 65, 'lev': 12, 'lat': 361, 'lon': 720})
Additional Info: /home/workspace/.venv/lib/python3.8/site-packages/xarray/coding/times.py:144: SerializationWarning: Ambiguous reference date string: 1-1-1 00:00:0.0. The first value is assumed to be the year hence will be padded with zeros to remove the ambiguity (the padded reference date string is: 0001-1-1 00:00:0.0). To remove this message, remove the ambiguity by padding your reference date strings with zeros.
  warnings.warn(warning_msg, SerializationWarning)

Note: The URL I used is only valid for a few days. If you want to try it out, change the date in the path (/gefs.../) to a more recent one.
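The SerializationWarning shown above is issued through Python's warnings module, so the warnings-management route could look roughly like the sketch below. It is only an illustration: `call_and_collect_warnings` and `noisy_open` are hypothetical names, and `noisy_open` is a stand-in for `xr.open_dataset(url)` so that the snippet runs without network access.

```python
import warnings


def call_and_collect_warnings(func, *args, **kwargs):
    """Run func and return (result, list of formatted warning strings)."""
    with warnings.catch_warnings(record=True) as caught:
        # Record every warning instead of printing it to stderr
        warnings.simplefilter("always")
        result = func(*args, **kwargs)
    messages = [f"{w.category.__name__}: {w.message}" for w in caught]
    return result, messages


def noisy_open(path):
    # Hypothetical stand-in for xr.open_dataset(url); it only emits a warning
    warnings.warn(f"Ambiguous reference date string in {path}", UserWarning)
    return {"path": path}


result, messages = call_and_collect_warnings(noisy_open, "example.nc")
for message in messages:
    print("Captured warning:", message)
```

The collected strings can then be passed to a logger. This only covers messages that go through the warnings module.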

But...

In other cases, redirecting stderr has no effect, for example when the URL references a non-existent resource. Setting url = "http://nomads.ncep.noaa.gov:80/dods/gefs/asdf/gec00_00z_pgrb2a" and executing the same code gives:

oc_open: server error retrieving url: code=0 message="/gefs/asdf/gec00_00z_pgrb2a is not an available dataset"
Additional Info: Error Log = [Errno -70] NetCDF: DAP server error: b'http://nomads.ncep.noaa.gov:80/dods/gefs/asdf/gec00_00z_pgrb2a'

As you can see, the oc_open line was written to the console directly, and the code has no handle on it. It does not pass through sys.stdout or sys.stderr, so it can clutter the console or go missing from the actual logs.

I'd like to understand what happens here. Is xarray spawning a subprocess that has no idea about the sys.stderr situation? How can I properly log these messages, or at least keep them from cluttering the console?
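One possible explanation, which is an assumption and not confirmed anywhere in this thread, is that the oc_open line is printed by compiled code underneath xarray (the netCDF-C library) straight to the operating system's file descriptor 2. Reassigning sys.stderr only swaps a Python object and leaves that descriptor untouched. If that is the cause, redirecting the descriptor itself might capture the message. Below is a minimal sketch of that idea. `capture_fd` is a hypothetical helper name, and `os.write(2, ...)` merely simulates a C-level write so that the snippet runs without xarray.

```python
import os
import sys
import tempfile
from contextlib import contextmanager


@contextmanager
def capture_fd(fd=2):
    """Temporarily redirect an OS-level file descriptor into a temp file."""
    captured = {"text": ""}
    # Flush Python-level buffers so earlier output is not captured by mistake
    for stream in (sys.stdout, sys.stderr):
        if stream is not None:
            stream.flush()
    saved_fd = os.dup(fd)  # keep a copy of the original descriptor
    with tempfile.TemporaryFile(mode="w+b") as tmp:
        os.dup2(tmp.fileno(), fd)  # fd now points at the temp file
        try:
            yield captured
        finally:
            os.dup2(saved_fd, fd)  # restore the original descriptor
            os.close(saved_fd)
            tmp.seek(0)
            captured["text"] = tmp.read().decode(errors="replace")


with capture_fd(2) as cap:
    # Simulated C-level write; it bypasses sys.stderr entirely
    os.write(2, b"simulated message from C code\n")

print("Captured:", cap["text"], end="")
```

In real use, the `os.write` call would be replaced by the `xr.open_dataset(url)` call, and `cap["text"]` could be handed to a logger afterwards. Whether this catches the oc_open message depends on the assumption above being correct.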

FlorianK
  • Is it possible that that error is actually being output to `stdout`? – pigrammer Sep 09 '22 at 18:50
  • Does normal [warnings management](https://stackoverflow.com/questions/14463277/how-to-disable-python-warnings) not do the trick? The error is a properly raised warning: see [`xarray/coding/times.py`](https://github.com/pydata/xarray/blob/main/xarray/coding/times.py#L143-L150) – Michael Delgado Sep 09 '22 at 20:54
  • @pigrammer, I also re-routed `stdout` for testing, but with no success... – FlorianK Sep 12 '22 at 01:29
  • @MichaelDelgado - I did some testing and can now with certainty say that the code example handles messages that are written to `stderr` and `stdout` correctly. With that it also captures the messages in e.g. [times.py](https://github.com/pydata/xarray/blob/main/xarray/coding/times.py#L143-L150). Other messages must show up in the console through some other route. I am clarifying this in my question... – FlorianK Sep 12 '22 at 11:58
