1

I am currently working on a project that involves climate model data stored in a NetCDF file. I am currently trying to calculate "weighted" spatial annual "global" averages for precipitation. I need to do this for each of the 95 years of global precipitation data that I have. The idea would be to somehow apply weights to each grid cell by using the cosine of its latitude (which means latitude grid cells at the equator would have a weight of 1 (i.e. the cosine of 0 degrees is 1), and the poles would have a value of 1 (as the cosine of 90 is 1)). Then, I would be in a position to calculate annual weighted averages based on averaging each grid cell.

I have an idea how to do this conceptually, but I am not sure where to begin writing a script in R to apply the weights across all grid cells and then average these for each of the 95 years. I would greatly appreciate any help with this, or any resources that may be helpful!!!

At the very least, I have opened the .nc file and read-in the NetCDF variables, as shown below:

ncfname<-"MaxPrecCCCMACanESM2rcp45.nc"
Prec<-raster(ncfname)
print(Prec)
Model<-nc_open(ncfname)
get<-ncvar_get(Model,"onedaymax")
longitude<-ncvar_get(Model, "lon")
latitude<-ncvar_get(Model, "lat")
Year<-ncvar_get(Model, "Year")

Additionally, let's say that I wanted to create a time series of these newly derived weighted averaged for a specific location or region, the following code that I previously used to show trends over the 95 years for one-day maximum precipitation works, but I would just need to change it slightly to use the annual weighted means? :

r_brick<-brick(get, xmn=min(latitude), xmx=max(latitude), ymn=min(longitude),                               
ymx=max(longitude), crs=CRS("+proj=longlat +ellps=WGS84 +datum=WGS84   
+no_defs+ towgs84=0,0,0"))
r_brick<-flip(t(r_brick), direction='y')
randompointlon<-13.178
randompointlat<--59.548
Random<-extract(r_brick, 
SpatialPoints(cbind(randompointlon,randompointlat)),method='simple')
df<-data.frame(year=seq(from=1, to=95, by=1), Precipitation=t(Hope))
ggplot(data=df, aes(x=Year, y=Precipitation,   
group=1))+geom_line()+ggtitle("One-day maximum precipitation (mm/day) trend  
for Barbados for CanESM2 RCP4.5")

Also, if it helps, here is what the .nc file contains:

 3 variables (excluding dimension variables):
    double onedaymax[lon,lat,time]   (Contiguous storage)  
        units: mm/day
    double fivedaymax[lon,lat,time]   (Contiguous storage)  
        units: mm/day
    short Year[time]   (Contiguous storage)  

 3 dimensions:
    time  Size:95
    lat  Size:64
        units: degree North
    lon  Size:128
        units: degree East

Again, any assistance would be extremely valuable with this! I look forward to your response!

Android17
  • 93
  • 1
  • 12

2 Answers2

1

Please ask one clear question at a time, and provide example data (through code).

I do not think you go about reading the ncdf data the right way. I think you should do

library(raster)
ncfname <- "MaxPrecCCCMACanESM2rcp45.nc"
Prec <- brick(ncfname, var="onedaymax")

(do not use nc_open etc)

To get a global weighted average

Example data

library(raster)
r <- abs(init(raster(), 'y'))
s <- stack(r, r, r)

s is a RasterStack with value 90 at the poles and 0 at the equator

The unweighted global mean. First average the layers, then the cells (reverse order would also work in this case)

sm <- mean(s, na.rm=TRUE)
cellStats(sm, mean, na.rm=TRUE)
[1] 45

Now use weighting (to get a lower number is the high latitudes get less weight)

# raster with latitude cell values 
w <- init(s, 'y')
# cosine after transforming to radians
w <- cos(w  * (pi/180))
# multiply weights with values
x <- sm * w
# compute weighted average
cellStats(x, sum) / cellStats(w, sum)
#[1] 32.70567

An alternative, and perhaps simpler, solution is uses the area of each cell (which is proportional to cos(lat)). The result is perhaps a little bit more precise (as area does not only considering the cell center).

a <- area(s) / 10000
y <- sm * a
cellStats(y, sum) / cellStats(a, sum)
#[1] 32.72697

Later:

For a time series, just use s.

unweighted

cellStats(s, mean) 
#layer.1 layer.2 layer.3 
# 45      45      45 

weighted

a <- area(s) / 10000
y <- s * a
cellStats(y, sum) / cellStats(a, sum)
# layer.1  layer.2  layer.3 
#32.72697 32.72697 32.72697 
Robert Hijmans
  • 40,301
  • 4
  • 55
  • 63
  • Hi Robert, Thank you for this is extremely helpful response! I just wanted to know if this procedure would provide a weighted average across all grid cells by averaging all 95 years of data? For example, would it provide a 95-year weighted average across each grid cell, or would it compute a "global" weighted average by averaging all 95 years (i.e. the 95-year global average)? Also, could you help me better understand what is behind the variable "s" and "w"? – Android17 Mar 19 '19 at 04:49
  • This produces a global weighted average, as you can see when you run the code. It is true that I did not explain much --- but you can look at the values of the objects and/or read the help files if the functions that create them. – Robert Hijmans Mar 19 '19 at 16:05
  • Hi Robert, Thanks, again, for your response. I tried running both methods, but is the value that I am seeing, from object "x", the global mean over the 95 years? Based on cellStats(x, mean), the value is 1.27 mm/day, although this differs slightly when I run the other method that you had suggested (1.56 mm/day). – Android17 Mar 19 '19 at 17:49
  • I have edited my answer to make it more clear (I hope) – Robert Hijmans Mar 19 '19 at 19:51
  • Hi Robert, Many thanks! Yes, the edit really helps. I just wanted to clarify, but does this compute the global mean for Year 1, or would this be averaging all 95 years? Finally, I would use the variable "Prec" within the raster function to read in the dataset, correct? I apologize for all of the questions, but I just wanted to be sure that I understand everything correctly! Thanks, again! – Android17 Mar 19 '19 at 21:04
  • It is the mean for all layers (years in your case). If you replace example data set `s` with `Prec` (created with `Prec <- brick(ncfname, var="onedaymax")` it should work. – Robert Hijmans Mar 19 '19 at 21:49
  • Hi Robert, I tried replacing "s" with "Prec", but I am receiving the same values as what you showed at each stage of cellStats instead of my own (i.e. 45 and then 32.70567). Perhaps because replacing "s" with "Prec" merely removes the contents of Prec by making it a raster stack? Could it be that "sm" should instead be replaced by Prec? – Android17 Mar 19 '19 at 23:21
  • Hi Robert, just to let you know that I think that I was able to run it successfully and plot the averages of the resultant single layer on a map - it looks right. :) Just a couple of small questions, but is it possible, with this new dataset of weighted globally annual averages, to derive a specific region's or location's weighted average by specifying latitude and longitude coordinates, just to see what that region or location's computed 95-year average is? – Android17 Mar 20 '19 at 16:09
  • I think the local weighted average is the same as the unweighted average (the normal mean), as a location has a single weight. So `sm` shows that. – Robert Hijmans Mar 20 '19 at 23:23
  • Ah, I see. But let's say that I wanted to verify that, is there a way to isolate a location or region to see what the 95-year average (weighted or unweighted) would be there? For example, if I wanted to have r compute the average across 20.5 to 23.2 degrees North and 80.3 to 85.5 degrees West, could it compute something like that? – Android17 Mar 21 '19 at 01:32
  • See: `?crop`, `?extract`, `?click`, (or ask a new question) – Robert Hijmans Mar 21 '19 at 04:07
  • Excellent! Yes, I tried both the extract and crop functions, and they work well for my purposes, showing the maximum and minimum values over regions of interest! There are actually some subtle differences when comparing unweighted and weighted averages for a selected location, but it is not substantial. Just one last question: I don't suppose it's possible to create a time series (global/local) of these newly derived average values? Probably not, though, since these values are now on a single layer that was created (as opposed to 95 separate layers), making it difficult to make a time series? – Android17 Mar 21 '19 at 15:45
  • I think it is simple. Use `s` instead of `sm` – Robert Hijmans Mar 22 '19 at 00:33
  • But "s", or "Prec" in this case, would not yield a 95-year time series of global/regional weighted averages, would it? – Android17 Mar 22 '19 at 05:13
  • No luck, unfortunately. But since this is now outside the boundary of the original question, it might be appropriate to start a new question. :) Thanks for all of your help with this, Robert - much appreciated! – Android17 Mar 22 '19 at 22:45
  • If you do *not* compute `sm` and replace all the references to `sm` with `s` you get what you are looking for. – Robert Hijmans Mar 22 '19 at 23:56
  • Hi Robert - it's just that I would like to somehow use "sm" (which is the new single-layer raster dataset of unweighted averages) and/or "x" (which is the new single-layer dataset of weighted averages) to create a time series (i.e. a time series of unweighted or weighted averages). The problem is that both "sm" and "x" are now one raster layer (as opposed to the original 95 layers), so I don't think that a time series is possible with a single layer. "s" or "Prec" would be the original 95-layer datasets before the weighted or unweighted averages were computed. Does that make sense? – Android17 Mar 23 '19 at 01:33
0

Not that I want to pull you away from R, but this sort of calculation is the absolute bread and butter of cdo (climate data operators) straight from the command line!

Calculate the spatial weighted mean (this accounts for the sin of latitude and can also handle reduced Gaussian grids etc) :

cdo fldmean input.nc fldmean.nc 

Calculate the annual mean

cdo yearmean input.nc yearmean.nc

Calculate a timeseries of annual global means by combining the two (i.e. use fldmean.nc as the input for the second command), or you can do on one line by piping:

cdo yearmean -fldmean input.nc yearglobal.nc

What's that? You want to calculate it for a lat-lon box region you say, not global averages? No problem, use sellonlatbox first to cut out an area

cdo sellonlatbox,lon1,lon2,lat1,lat2 in.nc out.nc 

so piping this:

cdo  yearmean -fldmean -sellonlatbox,lon1,lon2,lat1,lat2 in.nc yearregion.nc

But wait! you now want a specific location, not a region average? Well you can pick out the nearest gridbox to a location with

cdo remapnn,lon=mylon/lat=mylat in.nc out.nc 

so you can get your series of annual averaged values there with:

cdo yearmean -remapnn,lon=mylon/lat=mylat in.nc yearmylocation.nc

the possibilities are many... install it with

sudo apt install cdo 

and take a look at the documentation here: https://code.mpimet.mpg.de/projects/cdo/wiki/Cdo#Documentation

ClimateUnboxed
  • 7,106
  • 3
  • 41
  • 86
  • Hi Adrian, Thank you for your input - I will definitely consider cdo, but could what you suggested above similarly work in R? – Android17 Mar 19 '19 at 18:07
  • well you can run cdo using the system command in R https://stackoverflow.com/questions/37910749/is-it-possible-to-run-cdo-within-a-r-script or there seems to be a package on github that provides an interface although I've never tried it https://rdrr.io/github/ecor/cdor/ but why would you want to do this when you can simply manipulate the files from the command line? If you are on windows 10 it is straightforward to install ubuntu directly now so you can access all the linux-based utilities such as cdo easily. – ClimateUnboxed Mar 19 '19 at 20:18
  • Hi Adrian, thank you very much for your suggestion - I will give this a try, as well, just for the fun of it, as it does sound straight-forward! Yes, I am running on Windows 10. And yes, Robert's answer(s) is definitely helpful and have, thus, upvoted it! – Android17 Mar 20 '19 at 00:54
  • Hi Adrian, is there any way to give you an upvote, as well? It is just that I found your suggestion useful, too, and so would like to show some appreciation! :) – Android17 Mar 20 '19 at 16:32
  • just click the up triangle above the zero - same for Robert. – ClimateUnboxed Mar 21 '19 at 08:51