I am currently using RStudio on my MacBook Pro.

R version 3.5.0 (2018-04-23)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS High Sierra 10.13.4

When using the agnes() function from the cluster package I received the error message:

Error: vector memory exhausted (limit reached?)

To solve this, I followed the steps in the answer to the following question: "R on MacOS Error: vector memory exhausted (limit reached?)"

Now, when I run the same function, I get an "R session aborted" message: "R encountered a fatal error. The session was terminated."

Any other solutions?

  • When you start playing with how much memory can be allotted, you start playing with fire. You haven't described your data, but I'm guessing it's rather large. Is there any way to subset the data so that you perform your clustering on a smaller set? (I'm guessing not.) Do you have a larger computer available? – r2evans Oct 30 '18 at 15:19
  • The dataset is a data frame of 162,424 entries x 3 columns. Is that rather large? – Alicia Sara Davis Oct 30 '18 at 16:43
  • In general terms (outside of cluster analysis), not even close. But since I don't use `agnes`, I'm not familiar enough with its inner workings to know how it might "explode" the data. – r2evans Oct 30 '18 at 16:49

1 Answer

AGNES needs at least two copies of a distance matrix.

Now if you have 100,000 instances at double precision (8 bytes each), one full distance matrix takes 100,000 × 100,000 × 8 bytes = 80 GB, so two copies put memory usage on the order of 160,000,000,000 bytes. That is 160 GB, not including the input data or any overhead. Your 162,424 rows would need even more: about 211 GB per copy. If you are lucky, the R version of AGNES only stores the upper triangular matrix, which would reduce this by a factor of 2. But OTOH if it did, it would likely run into an integer overflow at about 64k objects.
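
The arithmetic above can be checked with a short R sketch. The helper name `dist_matrix_gb` and its arguments are made up for illustration, and it assumes 8-byte doubles and decimal gigabytes (1 GB = 1e9 bytes). It says nothing about what `agnes` actually allocates internally.

```r
# Back-of-the-envelope size of a dense pairwise distance matrix.
# Assumptions: 8-byte doubles, 1 GB = 1e9 bytes, no overhead counted.
dist_matrix_gb <- function(n, copies = 1, triangular = FALSE) {
  n <- as.numeric(n)  # work in doubles so n * n cannot overflow a 32-bit integer
  entries <- if (triangular) n * (n - 1) / 2 else n * n
  copies * entries * 8 / 1e9
}

dist_matrix_gb(1e5, copies = 2)            # 160, the figure quoted above
dist_matrix_gb(162424, copies = 2)         # roughly 422 for the data in the question
dist_matrix_gb(162424, triangular = TRUE)  # roughly 106 for one triangular copy

# Number of triangular entries versus the largest 32-bit signed integer:
# the count still fits at n = 65536 but no longer at n = 65537.
65536 * 65535 / 2 <= .Machine$integer.max  # TRUE
65537 * 65536 / 2 <= .Machine$integer.max  # FALSE
```

Even the most optimistic case, a single triangular copy, is far beyond the RAM of a laptop.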

So you probably need to choose a different algorithm than AGNES, or reduce your data first.
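
As a minimal sketch of what "reduce your data first" could look like, here are two common options: cluster a random subsample, or compress the data with k-means and run AGNES on the centers only. This is one possible reading, not a tested recipe for your data. The data frame `df`, the sample size, the number of centers, the Ward linkage, and the cut at 5 groups are all placeholder assumptions.

```r
library(cluster)

set.seed(1)
# Hypothetical stand-in for the real 162,424 x 3 data frame
df <- data.frame(x = rnorm(5000), y = rnorm(5000), z = rnorm(5000))
X <- scale(df)  # standardize columns so no variable dominates the distances

# Option 1: run AGNES on a random subsample (1,000 rows is an arbitrary choice)
idx <- sample(nrow(X), 1000)
ag_sub <- agnes(X[idx, ], method = "ward")

# Option 2: compress to 200 k-means centers, then run AGNES on the centers
km <- kmeans(X, centers = 200, iter.max = 50)
ag_centers <- agnes(km$centers, method = "ward")

# Cut the tree built on the centers into 5 groups, then give every original
# row the group of the k-means center it was assigned to
center_groups <- cutree(as.hclust(ag_centers), k = 5)
row_groups <- unname(center_groups[km$cluster])
table(row_groups)
```

With the second option the distance matrix has 200 rows rather than 162,424, so memory stops being the constraint. The trade-off is that the hierarchy only describes structure above the level of the k-means centers.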

Has QUIT--Anony-Mousse