
I am experiencing an error in R that says:

> Error: protect(): protection stack overflow

I have learned through googling that I need to increase:

> --max-ppsize

(see the R manual page on Memory)

This can only be set when starting R so I write the following in the command prompt:

C:\Program Files\RStudio\bin\rstudio.exe --max-ppsize=5000000

The error still occurs. I am running a 1500R x 26000C dataset.

How do I solve this problem?


Edit:

The problem occurs in a standard svm() function, where I pass a dataset of the size 600R x 26,000C. It does not happen when the dataset is 600R x 12,000C.

> model <- svm(TARGET ~ ., data = ds, type = "C-classification", kernel = "linear", scale = TRUE, cost = c, cross = k)
zx8754
Kasper Christensen
    Your max value is invalid. The largest you can input is `--max-ppsize=500000` – HavelTheGreat Feb 25 '15 at 20:20
  • Elizion: Just tried your suggested correction. Still no effect... – Kasper Christensen Feb 25 '15 at 20:27
  • Are you using R 32 bit or 64 bit? – HavelTheGreat Feb 25 '15 at 20:28
  • 1
    I would advise you take a look at this then http://stackoverflow.com/questions/12767432/how-can-i-tell-when-my-dataset-in-r-is-going-to-be-too-large – HavelTheGreat Feb 25 '15 at 20:33
  • 3
    I suggest you create a minimal [reproducible example](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) so we can see what you are trying to do. There might be problems in your code that lead to this memory issue. – MrFlick Feb 25 '15 at 20:39
  • I updated the question. The problem occurs in a SVM function and I am pretty sure it is due to data amount... – Kasper Christensen Feb 25 '15 at 21:10
  • It definitely seems that your data set is too large for svm. How big is the data set (in MB)? Is it sparse? If so, it might be converted to a sparse matrix. Another idea would be to run the svm command on a subset of the original dataset, e.g. the first 1000 rows, and then find out what is the biggest subset of the data that svm can still handle. – Balint Domokos Feb 26 '15 at 07:42
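The subsetting idea from the last comment can be sketched as follows. This is illustrative only: `largest_fittable` is a hypothetical helper name, and the iris-based `ds` is a toy stand-in for the question's data, renamed so it has a `TARGET` column like the original code.

```r
library(e1071)  # provides svm()

# Hypothetical helper: try growing row subsets and report the largest
# one that svm() can still fit with the formula interface.
largest_fittable <- function(ds, sizes) {
  ok <- 0L
  for (n in sizes) {
    fit <- tryCatch(
      svm(TARGET ~ ., data = ds[seq_len(n), ], kernel = "linear"),
      error = function(e) NULL  # a failed fit (e.g. out of memory) returns NULL
    )
    if (is.null(fit)) break
    ok <- n
  }
  ok
}

# Toy demo: shuffled iris with Species renamed to TARGET
set.seed(1)
ds <- iris[sample(nrow(iris)), ]
names(ds)[names(ds) == "Species"] <- "TARGET"
largest_fittable(ds, sizes = c(50, 100, 150))
```

On the real 600 x 26,000 data the `sizes` vector would of course span much coarser steps.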

4 Answers


I ran into a similar problem and found that the actual issue was the expansion of the formula into a model matrix. If you can get the data into that format without using a formula, and then use the overload of the svm command (available in many other model functions too) that takes an x and a y value instead, your problem may go away like mine did.
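A sketch of that x/y interface, assuming e1071's svm() and a data frame `ds` with a `TARGET` class column as in the question (the iris-based `ds` here is just a runnable stand-in):

```r
library(e1071)  # provides svm()

# Toy stand-in for the question's data: iris with Species renamed to TARGET
ds <- iris
names(ds)[names(ds) == "Species"] <- "TARGET"

# Build the predictor matrix ourselves instead of letting svm()
# expand TARGET ~ . into a model matrix
X <- as.matrix(ds[, setdiff(names(ds), "TARGET")])
y <- as.factor(ds$TARGET)

# x/y interface: no formula, so no large terms object for R to protect
model <- svm(x = X, y = y, type = "C-classification",
             kernel = "linear", scale = TRUE)
```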

Eric Czech

My way to fix a problem similar to yours:

  1. In the command line, cd into the location of the R program (e.g. C:\Program Files\R\R-3.1.3\bin\x64)
  2. In the command line, run Rgui.exe --max-ppsize=500000
  3. In the newly opened Rgui.exe, run options("expressions" = 20000)

Then do the coding... no more of the original error for me!

Hsiao-Pei Lu

The stack overflow might be a problem of too-deep recursion: a function calling itself recursively too many times, e.g. because of a missing exit condition. In that case there is no point in increasing the stack size; it will run out sooner or later anyway.

Balint Domokos

On my Windows 10 laptop, I ran into this same issue running a t-SNE analysis on a 945 x 22123 data.frame with Rtsne. Even after following the steps above to increase the ppsize and expressions (with Rstudio.exe substituted for Rgui.exe), I still got:

> Error: protect(): protection stack overflow

On another thread I found the recommendation to use a matrix instead of a data.frame, because the Rtsne function will have to convert to a matrix anyway and would thus hold both the data.frame and the converted matrix in memory simultaneously. The easy fix was to change from:

tsne1 <- Rtsne(df, dims = 2, pca = TRUE)

to:

tsne1 <- Rtsne(as.matrix(df), dims = 2, pca = TRUE)
Dan Adams