1

I am using the following Package:Phenor (https://github.com/bluegreen-labs/phenor). Within this package there is a function (format_eobs) which formats climatological data into a specific format for ingestion into Phenor. I have created my own climatology data that I want to format for ingestion into Phenor, but I am hitting an error when trying to use the function:

Error in h(simpleError(msg, call)) : 
  error in evaluating the argument 'x' in selecting a method for function 't': error in evaluating the argument 'x' in selecting a method for function 'as.matrix': invalid layer names

I renamed the function and have stepped though it to the source line of the error:

altitude = t(raster::as.matrix(eobs_data[[3]]$layer))

When I look at the eobs_data, I don't see any variable in the list called 'layer': Screenshot of eobs_data

I tested this with the original data also and the function ran perfectly, even though there no variable called 'layer' in it either. I'm thinking it might be something to do with the fact that the data is a raster brick, but I don't have any experience of this sort of datatype.

Does anyone know why the error is occuring? I have included the code of the function below.

myfun<-function (path = "~", year = 2014, offset = 264, resolution = 0.25, 
          internal = TRUE) 
{
  eobs_data = lapply(c("tg", "rr", "elev"), function(x) {
    filename = sprintf("%s_%sdeg_reg_v16.0.nc", x, resolution)
    if (file.exists(sprintf("%s/%s", path, filename))) {
      r = raster::brick(sprintf("%s/%s", path, filename))
      return(r)
    } else {
      stop("No E-OBS files found in the referred path !")
    }
  })
  eobs_date = as.Date(eobs_data[[1]]@z$Date)
  yday = as.numeric(format(as.Date(eobs_data[[1]]@z$Date), 
                           "%j"))
  years = as.numeric(format(as.Date(eobs_data[[1]]@z$Date), 
                            "%Y"))
  layers = which((years == (year - 1) & yday >= offset) | 
                   (years == year & yday < offset))
  if (length(layers) < 365) {
    stop("The selected dataset does not cover your data range!")
  }
  Ti = t(raster::as.matrix(raster::subset(eobs_data[[1]], 
                                          layers)))
  Pi = t(raster::as.matrix(raster::subset(eobs_data[[2]], 
                                          layers)))
  altitude = t(raster::as.matrix(eobs_data[[3]]$layer))
  ext = raster::extent(eobs_data[[1]])
  proj = raster::projection(eobs_data[[1]])
  size = dim(eobs_data[[1]])
  location = sp::SpatialPoints(sp::coordinates(eobs_data[[1]]), 
                               proj4string = sp::CRS(proj))
  location = t(sp::spTransform(location, sp::CRS("+init=epsg:4326"))@coords[, 
                                                                            2:1])
  if (offset < length(layers)) {
    doy = c(offset:length(layers), 1:(offset - 1))
  } else {
    doy = 1:length(layers)
  }
  Li = lapply(location[1, ], FUN = function(x) {
    daylength(doy = doy, latitude = x)
  })
  Li = t(do.call("rbind", Li))
  data = list(site = NULL, location = location, doy = doy, 
              transition_dates = NULL, Ti = Ti, Tmini = NULL, Tmaxi = NULL, 
              Li = Li, Pi = Pi, VPDi = NULL, georeferencing = list(extent = ext, 
                                                                   projection = proj, size = size))
  class(data) = "phenor_map_data"
  if (internal) {
    return(data)
  } else {
    saveRDS(data, file = sprintf("%s/phenor_eobs_data_%s.rds", 
                                 path, year))
  }
}

The code below creates three netcdf files which can then be called by 'myfun' above. I have tested it and it creates the same error that I have described above. It just needs the path to the newly created file included in the last line.

library(tidync)
library(ncdf4)
library(ncdf4.helpers)
library(RNetCDF)
library(easyNCDF)

#create data
tmintest=array(1:100, c(420,189,1461))

#create the list
Variable <- list(Data = tmintest)
xyCoords <- list(x = seq(-40.37,64.37,length.out=420), y = seq(25.37,72.37,length.out=189))
Dates <- list(start = seq(as.Date("2012-01-01"), as.Date("2015-12-31"), by="days"), end=seq(as.Date("2012-01-01"), as.Date("2015-12-31"), by="days"))
All <- list(Variable = Variable, xyCoords=xyCoords,Dates=Dates)

#Write the three netcdf files with test data created above put in each. 
#The function requires 3 .nc files ("tg","r","elev")

#----------------
# Make dimensions
#----------------
xvals<- All$xyCoords$x
yvals<- All$xyCoords$y
#xvals <- 1:512
#yvals <- 1:256

nx <- length(xvals)
ny <- length(yvals)

xdim <- ncdim_def( 'lon', 'degrees_east',longname="longitude", xvals )
ydim <- ncdim_def( 'lat', 'degrees_north',longname="latitude", yvals )
#tdim <- ncdim_def( 'time', 'climatology',longname="time", 0, unlim=TRUE )
#Try and mirror what is in the climex example file to see if that allows for ingestion
tdim <- ncdim_def( 'time', 'days since 2013-01-01 00:00',longname="time", 1:1461, unlim=TRUE )

#---------
# Make var
#---------
mv <- 1.00000002004088e+20     # missing value
var_temp <- ncvar_def( 'rr', 'C', list(xdim,ydim,tdim), mv )

#---------------------
# Make new output file
#---------------------
output_fname <- 'rr_0.25deg_reg_v16.0.nc'
ncid_new <- nc_create( output_fname, list(var_temp))

#-------------------------------
# Put the rr climatology data in the file
#-------------------------------

ncvar_put( ncid_new, var_temp, All$Variable$Data, start=c(1,1,1), count=c(nx,ny,1461))
#---------
#ADD SOME MORE ATTRIBUTES
#---------
ncatt_put(ncid_new, "rr", "long_name", "Maximum Temperature 2 m above ground")
ncatt_put(ncid_new, "rr", "standard_name", "air_temperature")
ncatt_put(ncid_new, "rr", "cell_methods", "time: maximum within days time: mean over days")
ncatt_put(ncid_new, "lon", "standard_name", "longitude")
ncatt_put(ncid_new, "lat", "standard_name", "latitude")
ncatt_put(ncid_new, "time", "standard_name", "time")
ncatt_put(ncid_new, "time", "calendar", "standard")
ncatt_put(ncid_new, "rr", "var_name", "rr")

#--------------------------
# Close our new output file
#--------------------------
nc_close(ncid_new )


#===========================================================================================


#----------------
# Make dimensions
#----------------
xvals<- All$xyCoords$x
yvals<- All$xyCoords$y
#xvals <- 1:512
#yvals <- 1:256

nx <- length(xvals)
ny <- length(yvals)

xdim <- ncdim_def( 'lon', 'degrees_east',longname="longitude", xvals )
ydim <- ncdim_def( 'lat', 'degrees_north',longname="latitude", yvals )
#tdim <- ncdim_def( 'time', 'climatology',longname="time", 0, unlim=TRUE )
#Try and mirror what is in the climex example file to see if that allows for ingestion
tdim <- ncdim_def( 'time', 'days since 2013-01-01 00:00',longname="time", 1:1461, unlim=TRUE )
#---------
# Make var
#---------
mv <- 1.00000002004088e+20     # missing value
var_temp <- ncvar_def( 'elev', 'metres', list(xdim,ydim,tdim), mv )

#---------------------
# Make new output file
#---------------------
output_fname <- 'elev_0.25deg_reg_v16.0.nc'
ncid_new <- nc_create( output_fname, list(var_temp))

#-------------------------------
# Put the elev climatology data in the file
#-------------------------------
#TmaxDelta$Data[is.nan(TmaxDelta$Data)] <- 0
ncvar_put( ncid_new, var_temp, All$Variable$Data, start=c(1,1,1), count=c(nx,ny,1461))
#---------
#ADD SOME MORE ATTRIBUTES
#---------
ncatt_put(ncid_new, "elev", "long_name", "Elevation")
ncatt_put(ncid_new, "elev", "standard_name", "Elevation")
ncatt_put(ncid_new, "elev", "cell_methods", "time: maximum within days time: mean over days")
ncatt_put(ncid_new, "lon", "standard_name", "longitude")
ncatt_put(ncid_new, "lat", "standard_name", "latitude")
ncatt_put(ncid_new, "time", "standard_name", "time")
ncatt_put(ncid_new, "time", "calendar", "standard")
ncatt_put(ncid_new, "elev", "var_name", "elev")

#--------------------------
# Close our new output file
#--------------------------
nc_close(ncid_new )

#=========================================tg


#----------------
# Make dimensions
#----------------
xvals<- All$xyCoords$x
yvals<- All$xyCoords$y
#xvals <- 1:512
#yvals <- 1:256

nx <- length(xvals)
ny <- length(yvals)

xdim <- ncdim_def( 'lon', 'degrees_east',longname="longitude", xvals )
ydim <- ncdim_def( 'lat', 'degrees_north',longname="latitude", yvals )
#tdim <- ncdim_def( 'time', 'climatology',longname="time", 0, unlim=TRUE )
#Try and mirror what is in the climex example file to see if that allows for ingestion
tdim <- ncdim_def( 'time', 'days since 2013-01-01 00:00',longname="time", 1:1461, unlim=TRUE )
#---------
# Make var
#---------
mv <- 1.00000002004088e+20     # missing value
var_temp <- ncvar_def( 'tg', 'C', list(xdim,ydim,tdim), mv )

#---------------------
# Make new output file
#---------------------
output_fname <- 'tg_0.25deg_reg_v16.0.nc'
ncid_new <- nc_create( output_fname, list(var_temp))

#-------------------------------
# Put the tg climatology data in the file
#-------------------------------
#TmaxDelta$Data[is.nan(TmaxDelta$Data)] <- 0

ncvar_put( ncid_new, var_temp, All$Variable$Data, start=c(1,1,1), count=c(nx,ny,1461))
#---------
#ADD SOME MORE ATTRIBUTES
#---------
ncatt_put(ncid_new, "tg", "long_name", "Maximum Temperature 2 m above ground")
ncatt_put(ncid_new, "tg", "standard_name", "air_temperature")
ncatt_put(ncid_new, "tg", "cell_methods", "time: maximum within days time: mean over days")
ncatt_put(ncid_new, "lon", "standard_name", "longitude")
ncatt_put(ncid_new, "lat", "standard_name", "latitude")
ncatt_put(ncid_new, "time", "standard_name", "time")
ncatt_put(ncid_new, "time", "calendar", "standard")
ncatt_put(ncid_new, "tg", "var_name", "tg")

#--------------------------
# Close our new output file
#--------------------------
nc_close(ncid_new )

#Then try and read in the files using myfun YOU NEED TO INCLUDE THE FOLDER PATH HERE.  
# If you run the code above, it saves to the user's home directory.
myfun(path="~", year = 2014, offset = 264, resolution = 0.25,internal = TRUE))

Updated from this point with responses to both comments

I checked the suggestions by R.Hijmans: First with my dataset:

x <- eobs_data
 class(x)

returns:

"list"

then

y <- x[[3]]
class(y)

returns:

[1] "RasterBrick"
attr(,"package")
[1] "raster"

then

names(y)

returns:

[1] "X2013.01.02" "X2013.01.03" "X2013.01.04"……

then

Y

Returns:

class      : RasterBrick 
dimensions : 189, 420, 79380, 1461  (nrow, ncol, ncell, nlayers)
resolution : 0.25, 0.25  (x, y)
extent     : -40.5, 64.5, 25.25, 72.5  (xmin, xmax, ymin, ymax)
crs        : +proj=longlat +datum=WGS84 +no_defs 
source     : C:/Users/elev_0.25deg_reg_v16.0.nc 
names      : X2013.01.02, X2013.01.03, X2013.01.04, X2013.01.05, X2013.01.06, X2013.01.07, X2013.01.08, X2013.01.09, X2013.01.10, X2013.01.11, X2013.01.12, X2013.01.13, X2013.01.14, X2013.01.15, X2013.01.16, ... 
Date       : 2013-01-02, 2017-01-01 (min, max)
varname    : elev

Second with the Phenor dataset:

x <- eobs_data
 class(x)

returns:

"list"

then

y <- x[[3]]
class(y)

returns:

[1] "RasterBrick"
attr(,"package")
[1] "raster"

then

Names(y)

returns:

“layer”

then

Y:

returns:

class      : RasterBrick 
dimensions : 201, 464, 93264, 1  (nrow, ncol, ncell, nlayers)
resolution : 0.25, 0.25  (x, y)
extent     : -40.5, 75.5, 25.25, 75.5  (xmin, xmax, ymin, ymax)
crs        : +proj=longlat +datum=WGS84 +no_defs 
source     : C:/Users/…/elev_0.25deg_reg_v16.0.nc 
names      : layer 
varname    : elevation

It appears then, in the dataset that I created, names(y) is returning the dates, while in the example dataset, it returns the character string "layer". I don't see where this character string is coming from, but I am very new to R, so I could be missing something obvious.This explains why the error is being thrown.

@Wibeasley I ran your suggestion of

altitude = t(raster::as.matrix(raster::subset(eobs_data[[3]], layers)))

and it resulted in the same error for the same reason (i.e. when I applied the commands the R. Hijmans suggested, names(y) returned the dates as opposed to the word "layer". I obviously need to change the structure of my own dataset to reflect the inclusion of the "layer" somehow.

matlabcat
  • 172
  • 2
  • 12
  • Can you please read and incorporate elements from [How to make a great R reproducible example?](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example). Especially the aspects of using `dput()` for the input and then an explicit example of your expected dataset? Use a small subset of your dataset, or at least a small representation with fake data points. – wibeasley Dec 09 '20 at 14:34
  • @wibeasley The data (list) that I am using was created in another R package. I don't know how to create the exact hierarchy that exists within the variables in the dataset. There's no guarantee I would get it precisely the same as the dataset I am using, which means it may not reproduce the example. I'll look further in to it. In the meantime, I suppose I'm hoping someone recognises the error without the data. – matlabcat Dec 09 '20 at 15:20
  • I gotcha -that's tough to debug and I like how you're going through it. I'm not sure either why the sample dataset is working if you don't see a `layers` column in the step-through debugger. I'm guessing it's related to [S4 dispatching](http://adv-r.had.co.nz/S4.html). Is `layers` inside `eobs_data[[3]]$data` or another element? Do the elements of `eobs_data[[3]]` look the same between the two datasets? – wibeasley Dec 09 '20 at 15:41
  • @wibeasley I had added some code that creates the necessary file to reproduce the example. – matlabcat Dec 09 '20 at 16:11
  • Yikes, I see why you were reluctant to post an example dataset. I made some small edits to make it easier for others to run and step-through. Revert any place where I didn't represent your intent well. – wibeasley Dec 09 '20 at 17:56

3 Answers3

0

Without knowing much about this field, my guess is that the line

altitude = t(raster::as.matrix(eobs_data[[3]]$layer))

should be changed to this:

altitude = t(raster::as.matrix(raster::subset(eobs_data[[3]], layers)))

But please carefully check this because I only got it to run from a syntax-perspective. I can't tell what the intended output should be.

Do Ti, Pi, and altitude represent three dimensions? I'm guessing the structure of the S4 class changed and the equations for Ti and Pi were updated, but altitude was overlooked. That doesn't satisfactorily explain why the code worked for your example dataset though.

Where did myfun() come from? It fails later because it can't find the daylength() function --but I'm guessing that's unrelated to the error raised in this post.

wibeasley
  • 5,000
  • 3
  • 34
  • 62
  • I have responded in the main body of the text re: your suggestion. In response to your questions. 'Ti','Pi' represent temperature and precipitation, while 'altitude' is the elevation. `myfun()` is a function from the phenor packagewhich I copied and pasted and renamed `'myfun'`. – matlabcat Dec 10 '20 at 09:49
0

You are going about debugging this the wrong way. You need to unpack this, and please edit your question. At this point all that matters is this line

 altitude = t(raster::as.matrix(eobs_data[[3]]$layer)

So first get

 x <- eobs_data
 class(x)

Presumably it is a list? Or a Raster*?

Then do

 y <- x[[3]]

If this still works, we are getting closer. What is it? A RasterLayer or RasterBrick? Show it

 class(y)
 names(y)
 y

Perhaps do the same for a case that works. Then you see the difference. And then, perhaps, we can help you. At that stage it would be also be trivial to make a reproducible example.

Based on your updated answer, you could replace

eobs_data[[3]]$layer

with

eobs_data[[3]][[1]]

And it would probably work. But I would not trust it, as in one example eobs_data[[3]] has one layer (with the default name "layer") --- and in the other there are 1461. That does not seem right. So your mistake is perhaps earlier?

Robert Hijmans
  • 40,301
  • 4
  • 55
  • 63
  • I have responded to your suggestions in the main body of the question. – matlabcat Dec 10 '20 at 09:50
  • examining the 'names' for my variables and the variables in the example dataset has got me thinking that the problem could be related to the fact that I have allocated the same dates/time series to the 'elevation' netcdf as I have to the other variables. This doesn't make sense, as elevation is atemporal (on the timescale I'm interested in anyway:) I will write out the nc file again with no dates before I apply the raster function to see if that changes anything. Maybe I can manipulate the 'names' in the raster then too. – matlabcat Dec 10 '20 at 11:13
0

The problem here was that one of the netcdf files (elevation) being read into myfun had a 'time' dimension (which I wrote into the file myself). When I removed time, the raster::brick function created the name 'layer' within eobs_data[3]], which allowed the function to perform as anticipated. Thank you both for your input. I figured it out based on your helpful suggestions.

matlabcat
  • 172
  • 2
  • 12