0

I'm trying to speed up some raster processing I'm doing using terra::focal by using focalCpp.

Here is some example data with 1s and NAs included to replicate an actual dataset

nr <- nc <- 50
r <- rast(ncols=nc, nrows=nr, ext= c(0, nc, 0, nr))
values(r) <- rep(c(rep(NA, 10), rep(1, 10), seq(1,8), rep(1,12), rep(NA,5), rep(1,15),seq(1,8), rep(NA,12), seq(1,20)), 50)
 

This is the original function used with focal that I'm trying to duplicate in Rcpp

fxnpercent = function(x) {
  q=x # make a copy of x
  q[q!=1] = NA # q becomes just 1s
  length(q[!is.na(q)])/length(x[!is.na(x)]) * 100 # gets percentage of 1s
}

This is the original focal function with a window approximated by a 200m buffer

# moving window matrix 
mat = matrix(1,15,15) # create matrix of 1's that envelopes the extent of the buffer
gr = expand.grid(1:nrow(mat), 1:nrow(mat)) # df of all pairwise values based on row/col index
center = 8 # centroid index of the square grid
gr$dist = sqrt((gr$Var1-center)^2 + (gr$Var2-center)^2) # euclidean distance calucation 
threshold = 200/30 # 200m threshold is converted into number of pixels from center
gr$inside = ifelse(gr$dist < threshold, 1, NA) # if distance is less than threshold, grid value is one, otherwise NA
w = matrix(gr$inside, 15,15)  # Using gr$inside, place indexed values into matrix of original dimensions

#output percent from moving window
percent = terra::focal(x=r, w=w, fun=fxnpercent, na.policy="all")

And this is my attempt to duplicate the fxnpercent function in Rcpp

cppFunction( 
    'NumericVector fxnpercent(NumericVector x, size_t ni, size_t nw) {
        NumericVector out(ni);

        // loop over cells
        size_t start = 0;
        for (size_t i=0; i<ni; i++) {
            size_t end = start + nw;
            // compute something for a window
            double v = 0;
            // loop over the values of a window
            for (size_t j=start; j<end; j++) {
                if (x[j] != 1) {
                  v += std::nan("");
                }
                else {
                  v += x[j];
                }
                
            }
            
            NumericVector x1 = x[!is_na(x)];
            NumericVector v1 = v;
            NumericVector v2 = v1[!is_nan(x)];
            
            size_t v2size = v2.size();
            size_t x1size = x1.size();
            
            out[i] = (v2size / x1size) * 100;
            
            start = end;
        }
        return out;
    }'
  )

After lots of troubleshooting to get the syntax correct in Rcpp, I try to run this function with focalCpp and I have an error.


percent = focalCpp(r, w=w, fun=fxnpercent, na.policy="all")

#my current error

Error: [focalCpp] test failed

I think I need to do some calculations within the window loop in the Rcpp function but I'm having trouble with understanding how to set it up to work correctly.

user_123
  • 62
  • 12
  • 1
    I'm not diving into this but would like to mention that you should usually use `R_xlen_t` for indices and not `size_t`. I don't know why the `focalCpp` documentation demands the latter. – Roland Jul 28 '23 at 04:48
  • 2
    The key to debugging this should be to find out what was tested for and failed. – Roland Jul 28 '23 at 04:52
  • 1
    This seems to be the test that failed: https://github.com/rspatial/terra/blob/master/R/focal.R#L361 – Roland Jul 28 '23 at 04:57
  • Thanks for your tips @Roland, and for pointing me to where the error message is coming from. There seems to be an issue with the C++ code and I'm still looking to better understand what focalCpp needs from a custom function – user_123 Jul 28 '23 at 12:40

1 Answers1

1

NaN checking is not done via is_nan() but rather using std::isnan()

#include<Rcpp.h>
using namespace Rcpp;

// [[Rcpp::export]]
NumericVector fxnpercent(NumericVector x, size_t ni, size_t nw) {
  NumericVector out(ni);
  
  // loop over cells
  size_t start = 0;
  for (size_t i=0; i<ni; i++) {
    size_t end = start + nw;
    // compute something for a window
    double v = 0;
    double totalCount = 0;
    double countOfOnes = 0;

    // loop over the values of a window
    for (size_t j=start; j<end; j++) {
      if (!std::isnan(x[j])) {
        totalCount++;
        if (x[j] == 1) {
          countOfOnes++;
        }
      }
    }

    if (totalCount > 0) {
      out[i] = (countOfOnes / totalCount) * 100;
    } else {
      out[i] = std::nan(""); // it is NaN if all are NaN
    }
    
    start = end;
  }
  return out;
}

MaikeruDev
  • 1,075
  • 1
  • 19
  • 1
    Thank you so much @MaikeruDev! This is exactly what I'm looking for. – user_123 Aug 04 '23 at 15:46
  • 2
    Careful. That is true in plain C++ but different on R data as R has NA, NaN and NULL, See eg the classis answer by @KevinUshey here: https://stackoverflow.com/questions/26241085/rcpp-function-check-if-missing-value/26262984#26262984 – Dirk Eddelbuettel Aug 04 '23 at 23:53
  • 1
    To clarify/extent: in C(++) code from R we usually assign with `NA_REAL` which is also aliased to `R_NaReal`. Doing so somewhat better _in the context of R_ because we have cases where we need `NA_INTEGER` (aka `R_NaInt`), `NA_LOGICAL` (also `R_NaInt` as logicals are intgers to allow for `NA`), `NA_STRING` aka `R_NaString` and so on covering the `NA` extensions R has. For the sole case of a `double` using the standard also works so the answer is fine. But ... the additional R considerations matter at times. – Dirk Eddelbuettel Aug 06 '23 at 14:10