
I noticed that, depending on the size of the area queried through the CDS API (more specifically, the cdsapi Python library), I receive slightly different precipitation values for the same coordinates.

Let's take an example: I want the daily precipitation for 2009-11-30 (local time) at coordinates (9.75 lat, 122.75 lon). That means querying the range 2009-11-29 to 2009-11-30 and applying an 8-hour shift, in case someone wants to reproduce it.

In bbox = [12.50, 118.00, 7.75, 125.50] the value at (9.75 lat, 122.75 lon) is 0.000308474.

In bbox = [10.50, 122.50, 9.50, 125.00] the value at (9.75 lat, 122.75 lon) is 0.000308558.

Both requests are snapped to a 0.25° grid, so I would expect no difference between them. Of course, we are talking about 1/1000 of a millimeter here, but it would break my data-consistency tests.

Do you know what could be the reason for this? Is it just ordinary floating-point inaccuracy?

ClimateUnboxed
KJarocki
  • I don't think we can help; this may be a question for the API provider. But if you're asking for the average rainfall in two different areas, is it not reasonable for it to give you two different answers? – JeffUK Jul 19 '21 at 16:47
  • ERA5 provides a value for every grid point, snapped to 0.25 in latitude and longitude. This is not a request for average precipitation; it is a request for all points with precipitation values in the given area. There are in fact two queries here: first, ask the server for a chunk of data for some part of the world (bbox), and then check the value at specific coordinates (9.75 lat, 122.75 lon) that both bboxes contain. The result is just an array of values with the coordinate pairs contained in the bbox: in the first case it will be (12.50, 118.00), (12.75, 118.00), (13.00, 118) and so on. – KJarocki Jul 19 '21 at 20:56
  • I will edit the main example a little because it is indeed not very clear, thanks for pointing that out. – KJarocki Jul 19 '21 at 21:01
  • If the question boils down to 'Given x requests, this third-party API returns y response' then it's not really a programming question. Does the API give the same result if you call it directly, using something like Postman to construct the requests? Either way, it's probably best to look for a support contact on the CDS website. – JeffUK Jul 20 '21 at 08:08
  • Indeed, it is more a question about the way the API works, thank you for your advice. As an update for people with the same concern, here is the answer from the provider: "The data values in netCDF files from the CDS are 'packed' using a scale factor and an offset (with some loss of precision). These packing values will vary depending on the range of actual data values. The 2 different selected areas will very likely have different min/max values, hence different scale factor and offset values, and I suspect that these lead to the numerical differences you see when the data are unpacked to get the data values." – KJarocki Jul 21 '21 at 08:06
  • I would suggest you put this information in as an answer and accept it. – adr Aug 03 '21 at 12:13

1 Answer


Answer from the provider: "The data values in netCDF files from the CDS are 'packed' using a scale factor and an offset (with some loss of precision). These packing values will vary depending on the range of actual data values. The 2 different selected areas will very likely have different min/max values, hence different scale factor and offset values, and I suspect that these lead to the numerical differences you see when the data are unpacked to get the data values."
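To illustrate the effect the provider describes, here is a minimal sketch of scale/offset packing into 16-bit integers. The exact formula the CDS uses is not stated above, so the one below is an assumption (a common convention), and the "true" value and the min/max ranges of the two areas are made-up numbers chosen only for illustration.

```python
import math


def packing_params(vmin, vmax, nbits=16):
    """Scale factor and offset mapping [vmin, vmax] onto signed nbits integers.

    Assumed convention, not necessarily the one used by the CDS.
    """
    scale = (vmax - vmin) / (2 ** nbits - 1)
    offset = vmin + 2 ** (nbits - 1) * scale
    return scale, offset


def pack(value, scale, offset):
    """Quantize a float to the nearest integer step (this is where precision is lost)."""
    return round((value - offset) / scale)


def unpack(packed, scale, offset):
    """Recover an approximate float from the packed integer."""
    return packed * scale + offset


# Hypothetical "true" value at the shared grid point.
true_value = 0.0003085

# Hypothetical min/max of the data in each bbox; a different range
# gives a different scale factor and offset.
large_area = packing_params(0.0, 0.0412)
small_area = packing_params(0.0, 0.0357)

from_large = unpack(pack(true_value, *large_area), *large_area)
from_small = unpack(pack(true_value, *small_area), *small_area)

print(from_large, from_small)

# Each unpacked value is within half a quantization step of the true value,
# so the two results differ by at most the larger of the two steps.
tolerance = max(large_area[0], small_area[0])
print(math.isclose(from_large, from_small, abs_tol=tolerance))
```

For consistency tests, comparing values with an absolute tolerance of about one quantization step (the `scale_factor` stored in the netCDF variable's attributes) may be more robust than testing for exact equality.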

KJarocki
  • This is useful. Did you try the request in CDSAPI, downloading the original GRIB file (instead of requesting format=NETCDF) and then extracting the point with e.g. cdo? Do you get the same answer then? – ClimateUnboxed Aug 23 '21 at 07:43
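The comparison suggested in the last comment could be set up roughly as follows. This is only a sketch: the dataset name (`reanalysis-era5-single-levels`), the variable name, and the request keys (including `format`, taken from the comment) are assumptions and may differ in the CDS version you use. `build_request` and `download` are hypothetical helpers, and `download` needs CDS credentials, so it is defined here but not called.

```python
def build_request(area, data_format):
    """Build a cdsapi request dict for hourly precipitation on 2009-11-29/30.

    `area` is [north, west, south, east]; the key names are assumptions.
    """
    return {
        "product_type": "reanalysis",
        "variable": "total_precipitation",
        "year": "2009",
        "month": "11",
        "day": ["29", "30"],
        "time": [f"{hour:02d}:00" for hour in range(24)],
        "area": area,
        "format": data_format,
    }


def download(request, target, dataset="reanalysis-era5-single-levels"):
    """Send the request to the CDS (requires the cdsapi package and an API key)."""
    import cdsapi  # imported lazily so the rest of the sketch runs without it

    cdsapi.Client().retrieve(dataset, request, target)


# The two bounding boxes from the question.
large_bbox = [12.50, 118.00, 7.75, 125.50]
small_bbox = [10.50, 122.50, 9.50, 125.00]

# Same areas, requested as GRIB instead of netCDF.
grib_requests = {
    "large.grib": build_request(large_bbox, "grib"),
    "small.grib": build_request(small_bbox, "grib"),
}

# Example (not executed here):
# for target, request in grib_requests.items():
#     download(request, target)
```

If the values extracted from the two GRIB files agree at (9.75 lat, 122.75 lon), that would support the explanation that the difference comes from the netCDF packing.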