
I am trying to use the R package dtw to calculate the distance between two numeric vectors. Here is a sample of my code:

library(dtw)

testNumbers <- sample(seq(from = 1, to = 60), size = 60000, replace = TRUE)
testNumbers2 <- sample(seq(from = 1, to = 60), size = 60000, replace = TRUE)

Sys.setenv('R_MAX_VSIZE' = 32000000000)
Sys.getenv('R_MAX_VSIZE')
dtw(testNumbers, testNumbers2, distance.only = TRUE)

I will eventually be using decoded WAV files, but those haven't worked either, so I've been testing with this sample data. As per this post, I checked that I am running the 64-bit version of R (I believe I am, because when I start R/RStudio I see `Platform: x86_64-apple-darwin15.6.0 (64-bit)`). I also increased the memory limit (and checked the setting) in the code above as per this post, but I still get this error:

Error: vector memory exhausted (limit reached?)
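For reference, the memory a full 60,000 × 60,000 double-precision distance matrix would need can be estimated with plain arithmetic (this is a rough back-of-the-envelope figure, not output from dtw):

```r
# rough estimate of the memory dtw() would need if it allocated the
# full 60,000 x 60,000 cost matrix (8 bytes per double)
n <- 60000
bytes_per_double <- 8
bytes_per_double * n * n / 1e9  # ~28.8 GB
```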

Lisa
  • Well, do you have enough RAM? Assuming `dtw` returns a `60,000x60,000` distance matrix, in the worst-case scenario you need up to ~30 GB of RAM (`sizeof(double) * 60000 * 60000 / 10^9`, assuming `sizeof(double)` = 8 bytes). Have you tried establishing that `dtw` works as expected with smaller vectors? – Maurits Evers Aug 26 '18 at 22:05
  • I didn't realize how much RAM that would take up. Guess I need to find a computer with much more RAM. I tried it on a smaller frame and it does work. thanks! – Lisa Aug 26 '18 at 22:46
  • @MauritsEvers, do you know if there is a less computationally-intensive way to do dynamic time warping? thanks! – Lisa Aug 27 '18 at 14:09
  • @Lisa If you're only using the `symmetric1` or `symmetric2` step patterns and no backtracking, my [`dtw_basic`](https://rdrr.io/cran/dtwclust/man/dtw_basic.html) implementation only allocates two columns of the distance matrix at once. See also [this answer](https://stackoverflow.com/a/50776685/5793905). – Alexis Aug 28 '18 at 17:06
  • @Alexis, thanks for your comment! I tried it but I am still getting the error with my relatively short sample data. thanks anyway! – Lisa Aug 30 '18 at 17:00
  • @Lisa Unfortunately I can't reproduce your problem; with your sample data I get no error on my machine (although it does take around a minute to finish). The other answer you linked to seems to be wrong: you cannot set `R_MAX_VSIZE` after starting R, see [this answer](http://r.789695.n4.nabble.com/R-3-5-0-vector-memory-exhausted-error-on-readBin-tp4750237p4750244.html) (and the other posts there too). – Alexis Aug 30 '18 at 18:00
  • Oh and BTW, if you get that problem fixed, also consider using either [Keogh's](https://rdrr.io/cran/dtwclust/man/lb_keogh.html) or [Lemire's](https://rdrr.io/cran/dtwclust/man/lb_improved.html) lower bound. Depending on how noisy your data is, it might be enough, and considerably faster. – Alexis Aug 30 '18 at 18:12
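The comments above can be combined into a minimal sketch of the lower-memory route via the dtwclust package (function names per the links in the comments; the smaller vector length and the `window.size` value below are illustrative choices, not tuned values):

```r
library(dtwclust)  # provides dtw_basic() and lb_keogh()

set.seed(1)
# start with smaller vectors, as suggested in the comments
x <- as.numeric(sample(1:60, size = 1000, replace = TRUE))
y <- as.numeric(sample(1:60, size = 1000, replace = TRUE))

# dtw_basic() only allocates two columns of the cost matrix at a time
# when backtrack = FALSE, so it avoids the full n x n allocation
d <- dtw_basic(x, y, backtrack = FALSE)

# Keogh's lower bound as a cheap pre-filter before a full DTW run;
# window.size = 50 is an arbitrary choice for illustration
lb <- lb_keogh(x, y, window.size = 50)$d

c(dtw = d, lower_bound = lb)
```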

0 Answers