0

I have trouble analysing MODIS NDVI data (Terra and Aqua), which was downloaded from AppEEARS.

The data is over a certain extent over a period of time, so I stack the different layers in one file.

library(raster)
setwd("~/NDVI/")
NDVI <- list.files(full.names = TRUE, pattern = ".tif$")
sNDVI <- stack(NDVI)

No issues here, but when I look at the number of observations per layer, they are different between layers. This makes further analyses between layers very difficult.

 layers <- 0; layers <- as.data.frame(layers)
    for (i in 1:NDVI@data@nlayers){
      layers[i,] <- length(rasterToPoints(subset(NDVI, i)))
    }

summary(layers[,1])
 Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
3330678 3340078 3342094 3341450 3343584 3345408

length(layers[,1])
[1] 142

I used to download the NDVI layers without problem from their previous system. Does anyone know what I am doing wrong, or how I could fix this? Cropping doesn't work.

Thank you for your help!

EDIT

The problem was caused by NAs in the data. I learned here how to visualise the NAs:

NDVI <- reclassify(NDVI, cbind(NA, 1000)) #or some value way higher than the rest of your data

plot(subset(NDVI, 1)

This revealed that the NAs were spread throughout my extent (I don't know why, as that was not the case with the previous data I downloaded). I needed to fill the NAs somehow and went for the less-than-optimal solution of filling them in with the value of the next non-NA value in that column:

library(zoo)
head(NDVI@data@values)
NDVI@data@values <- na.locf(NDVI@data@values, fromLast = T) # 'fromLast = T' makes the function take the value of the *next* rather than the *previous* value to fill in the NA.

Since the NAs made up less than 0.1 % of my total data, this didn't affect the mean value per layer I wanted to calculate.

This is probably not the way you want to deal with rasters, but thought I could be informative to edit my original question nonetheless. Cheers,

1 Answers1

1

This is a poor way to evaluate multivariate raster array data. The raster::rasterToPoints functions is dropping NA's thus, the differences in length. This is not a difference in n but, rather missing data.

The arrays match, otherwise you would receive an error with raster::stack. I would recommend applying common smoothing and nodata imputation procedures to this data. This will deal with the missing data, which is expected with this type of data due to effects such as clouds.

The development version of the spatialEco package has functions for raster timeseries data: smooth.time.series (NA imputation), sg.smooth (Savitzky-Golay smoothing) and raster.kendall (Kendall's Tau with Theil-Sen monotonic trend slope).

Jeffrey Evans
  • 2,325
  • 12
  • 18
  • Hi Jeffrey, and thank you for your quick reply. Unfortunately, your package 'spatialEco' no longer contains the function 'smooth.time.series' or 'sg.smooth' ([CRAN](https://cran.r-project.org/web/packages/spatialEco/spatialEco.pdf)). I tried the function 'raster.kendall': `NDVI1 <- raster.kendall(sNDVI)` but this takes at least several minutes to complete. What am I doing wrong? – user10187475 Aug 06 '18 at 16:00
  • 1
    I pointed out that the development version contained these functions not the CRAN version. I will have a new version, reflecting the additional functions, up on CRAN in a few weeks. You can install the development version using: devtools::install_github("jeffreyevans/spatialEco") as to the kendall function, it is doing quite a bit behind the scenes, at every pixel vector. It takes a bit of time to run, especially over long time-series or large rasters. – Jeffrey Evans Aug 06 '18 at 18:25
  • Hei Jeffrey, I wasn't able to use your function from the dev.version because it took way too long (let it run for more than 30min). Your reply did bring me to visit other pages that eventually made me 'fix' the issue, even though its probably not in a great way. Thanks for your help! – user10187475 Aug 07 '18 at 14:56