Calculate similarity in spatial utilization between .tif rasters using Earth Mover's Distance (EMD)

Question

I am analyzing animal tracking data within an acoustic receiver array using dynamic Brownian bridge movement models. The dataset contains an animal identifier and every detection on a receiver has a timestamp and a lat/lon coordinates (in decimal degrees). Further, there is are two additional grouping variables "studyperiod" and "sgroup".

  Elasmo            Datetime      Lat       Lon studyperiod sgroup
1 X10141 2019-02-13 15:29:00 25.72441 -79.30922      IS2019  naive
2 X10141 2019-02-13 15:44:00 25.72441 -79.30922      IS2019  naive
3 X10141 2019-02-13 15:48:00 25.72441 -79.30922      IS2019  naive
4 X10141 2019-02-13 17:17:00 25.72441 -79.30922      IS2019  naive
5 X10141 2019-02-13 17:20:00 25.72441 -79.30922      IS2019  naive
6 X10141 2019-02-13 18:00:00 25.72441 -79.30922      IS2019  naive

I then used the GitHub package 'dBBMMhomeRange' (/https://github.com/SimonDedman/dBBMMhomeRange) to calculate individual-level Utilization distributions first, which are then scaled, weighted, summed to group-level UDs. The resulting group-level UDs are saved as .ascii rasters. As the rasters had different extents, I imported them into ArcMap and specified a shared extent within the program. Rasters with the same shared extent were then exported from Arcmap and imported into R. The rasters have the following properties:

r1 <- raster("~/N_IS2017_sc.tif")
r4 <- raster("~/N_IS2020_sc.tif")

> r1;r4
class      : RasterLayer 
dimensions : 1450, 1264, 1832800  (nrow, ncol, ncell)
resolution : 50, 50  (x, y)
extent     : -31658.67, 31541.33, -36085.92, 36414.08  (xmin, xmax, ymin, ymax)
crs        : NA 
source     : N_IS2017_sc.tif 
names      : N_IS2017_sc 
values     : 0, 0.0003713508  (min, max)

class      : RasterLayer 
dimensions : 1450, 1264, 1832800  (nrow, ncol, ncell)
resolution : 50, 50  (x, y)
extent     : -31658.67, 31541.33, -36085.92, 36414.08  (xmin, xmax, ymin, ymax)
crs        : NA 
source     : N_IS2020_sc.tif 
names      : N_IS2020_sc 
values     : 0, 0.0004588088  (min, max)

Now, I would like to use Earth Mover's Distance as described by Kranstauber et al. (2016) to calculate similarity in space use between the different groups. To do so I used emdDists() function from the move package.

## create a raster stack
allrasters <- stack(r1,r4)

## EMD
emdDists <- emd(allrasters/cellStats(allrasters, sum), threshold = 700)

However, the function starts running but always crashes. Which makes me think that something I did was not correct.

So, this brings me to my two questions:

The corresponding paper uses a UDStack as object to supply to the EMDDists(). So I am unsure if it is possible to implement the EMD approach for here described .tif rasters as well?
If correct, are there ways to reduce the calculation power demands to R?

have your opened a second terminal and called `htop`, and watched what was going on in htop while `emd(` is running? From `?move::emd` examples, it looks like you want to calculate the probability surface first, then run `emd`, so break your last line into two, `my_stack <-(allrasters/cellStats(allrasters, sum))`, `my_emdDists <- emd(my_stack, threshold = 700)`, as a possibility. And you might want to look at [omd](https://github.com/sangwon-hyun/omd) for comparison. ~/notes section there is interesting if you git clone it to local. — Chris, Aug 31 '22 at 17:38
Thank you for the answer! I haven't used htop before and will give it a go. I have tried breaking the last line in two and calculated the probability surface first before running emd() but the outcome is the same and R crashes. — vheim, Sep 01 '22 at 05:18
`htop` gives a near real-time view of RAM usage, which is likely not the problem as you didn't say your system freezes. `gdal_translate -of AAIGrid file.tif file.asc` in terminal to address the .tif vs .asc . Then, do whatever is done to take .asc to UD, then `emd(`. On the dBBMhomeRange page, click on 107 commits and ask yourself, when did I download this relative to the commit dates. Useful to read through commits and comments as well, but issue is likely using proper raster type and processing to UD order. — Chris, Sep 01 '22 at 14:20
Ok, looked at the different rasters - I am at the stage of .asc rasters that all share the same projection and extent. I then created a RasterStack object as is needed in UDStack() which should allow me to change the RasterStack to a UDStack that can be used in emd(). However, this results in an error: `> test <- UDStack(stk) Error in validObject(.Object) : invalid class “.UDStack” object: FALSE` — vheim, Sep 07 '22 at 15:31
I've been reading `scaleraster.R` and `dBBMM.build.R`. (at github) I'd say sit back, read those, consider the `sapply(function(x) new('.UD'` (scaleraster.R- line 211, and more generally, `new('.UD', seems preferred to arrive at UD class objects. I have more reading to do, but getting .asc right (looking at .build.R) seems sufficient challenge to warrant 775 lines. — Chris, Sep 07 '22 at 17:30
I looked at the two functions you recommended and managed to create a UDstack object containing rasters as .UD objects converted from the .tif rasters. This allowed me to run emd(). I ran emd with 3 rasters only. That said, I will likely have to figure out how to run the calculations more efficiently, as even with the three rasters it took more than 2 hours and the memory usage was as high as 22 GB at times. I will update my answer with the code I used later. But, thank you for all your answers so far - they helped a lot! — vheim, Sep 12 '22 at 06:40

vheim · Accepted Answer · 2022-09-15T04:39:13.580

Thanks to comments from Chris (see above for background) I found the answer to my question. The important step was to convert the imported rasters to .UD class objects so that they can be used to create a UDStack. The code is pretty straightforward and quick using functions from the "raster" package and the "move" package.

library("move")
library("raster")

# Import rasters
r1 <- raster("~/R1.tif")
r2 <- raster("~/R2.tif")
r3 <- raster("~/R3.tif")

# create .UD class rasters
r1ud <- new(".UD", r1)
r2ud <- new(".UD", r2)
r3ud <- new(".UD", r3)

# create a UDStack needed for emd()
stk <- UDStack(as.list(r1ud, r2ud, r3ud))

# 'stk' can now be used within the emd() function as described by Kranstauber et al. 2017

Calculate similarity in spatial utilization between .tif rasters using Earth Mover's Distance (EMD)

1 Answers1