
I have a 1360x92735 CSV dataset and I need to reduce its dimensionality using the FSelector package for R (information.gain()), but it requires a lot of RAM.

My question is, can I use the ff package in combination with FSelector? If yes, how?

P.S. I have 8 GB of RAM and 8 GB of swap on Linux.

Thanks.

[EDIT]

I've tried using the ff and FSelector packages with the iris dataset. It seems to work well, but now I have a problem with ff.

My CSV dataset is 1303x92735, and when I try to convert a data frame to an ff object with as.ffdf(), or to load the dataset directly with read.csv.ffdf(), R crashes with a "write error".

Someone here has the same problem, but I can't tell whether a solution was reached or not.

Thanks.

Descanso7

1 Answer


The error is likely due to the fact that ff opens a file for each column in the ff data frame. You have 92,735 columns, which is likely far more than the maximum number of open files your system is configured to allow. I've answered this on SO here.
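To check whether this is what you are hitting, you can compare the number of columns in the CSV with the open-file limit of the shell that launches R. The following is only a rough sketch: the helper names `count_csv_columns` and `open_file_limit` are made up for illustration, the small temporary CSV stands in for your real file, and the `ulimit -n` call assumes a Linux or macOS shell.

```r
# Count the columns of a CSV by reading only its header line
# (hypothetical helper, not part of ff or FSelector).
count_csv_columns <- function(path, sep = ",") {
  header <- scan(path, what = character(), sep = sep, nlines = 1, quiet = TRUE)
  length(header)
}

# Ask the shell for the per-process open-file limit (hypothetical helper).
# Assumes a Unix-like system; returns NA if the shell reports "unlimited".
open_file_limit <- function() {
  out <- system("ulimit -n", intern = TRUE)
  suppressWarnings(as.numeric(out[1]))
}

# Small stand-in file so the sketch runs on its own;
# point csv_path at the real dataset instead.
csv_path <- tempfile(fileext = ".csv")
write.csv(data.frame(a = 1:3, b = 4:6, c = 7:9), csv_path, row.names = FALSE)

n_cols <- count_csv_columns(csv_path)
limit  <- open_file_limit()

if (!is.na(limit) && n_cols >= limit) {
  message("Columns (", n_cols, ") reach or exceed the open-file limit (", limit, ").")
} else {
  message("Columns (", n_cols, ") are below the open-file limit (", limit, ").")
}
```

If the column count turns out to be larger than the limit, one option is to raise the limit (for example with `ulimit -n` in the shell before starting R, as far as the system's hard limit allows); another is to load the columns in smaller batches.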

Chris Townsend