
I have attempted to use the following code to come up with a table of unique combinations of a bunch of variables.

V1=as.vector(CRmarch30[1])
V2=as.vector(CRmarch30[2])
V3=as.vector(CRmarch30[3])
V4=as.vector(CRmarch30[4])
V5=as.vector(CRmarch30[5])
V6=as.vector(CRmarch30[6])
V7=as.vector(CRmarch30[7])

As you may have already guessed, CRmarch30 is a data frame with 7 columns. I converted each column into a vector. Then I used the following code to create all unique combinations of the 7 variables:

combo=expand.grid(V1,V2,V3,V4,V5,V6,V7)
combo

Instead of getting the output, I get the following warning message:

 Warning message:
In format.data.frame(x, digits = digits, na.encode = FALSE) :
  corrupt data frame: columns will be truncated or padded with NAs

Could someone please help me with this?

Freewill
  • Error: cannot allocate vector of size 512001.3 Gb In addition: Warning messages: 1: In rep.int(rep.int(seq_len(nx), rep.int(rep.fac, nx)), orep) : Reached total allocation of 8089Mb: see help(memory.size) 2: In rep.int(rep.int(seq_len(nx), rep.int(rep.fac, nx)), orep) : Reached total allocation of 8089Mb: see help(memory.size) 3: In rep.int(rep.int(seq_len(nx), rep.int(rep.fac, nx)), orep) : Reached total allocation of 8089Mb: see help(memory.size) 4: In rep.int(rep.int(seq_len(nx), rep.int(rep.fac, nx)), orep) : Reached total allocation of 8089Mb: see help(memory.size) – Freewill Dec 03 '14 at 15:03

1 Answer

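`as.vector` is not converting the column to a vector here: `CRmarch30[1]` is a one-column data frame, and it is still a data frame after the call. For example: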

V1=as.vector(CRmarch30[1])
V2=as.vector(CRmarch30[2])
V3=as.vector(CRmarch30[3])

expand.grid(V1, V2, V3)
#  Var1 Var2 Var3
#1    1    5    0
#Warning message:
#In format.data.frame(x, digits = digits, na.encode = FALSE) :
# corrupt data frame: columns will be truncated or padded with NAs

 is.vector(V1)
 #[1] FALSE
 is.data.frame(CRmarch30[1])
 #[1] TRUE

You could have done

 V1 <- CRmarch30[,1]
 is.vector(V1)
 #[1] TRUE
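As a minimal sketch of the difference between these subsetting forms, the following uses a small made-up data frame (`df` and its columns are hypothetical, not the asker's data). It only illustrates that `df[1]` keeps the data frame structure while `df[, 1]` and `df[[1]]` extract the column itself:

```r
# Hypothetical toy data frame, used only for illustration
df <- data.frame(a = c(1, 2, 2), b = c(5, 6, 6))

one_col_df <- df[1]     # single bracket, no comma: a one-column data frame
col_comma  <- df[, 1]   # row/column indexing: the column as a plain vector
col_double <- df[[1]]   # double bracket: also the column as a plain vector

is.data.frame(one_col_df)          # the structure expand.grid() choked on
is.vector(col_comma)               # a genuine vector
identical(col_comma, col_double)   # both forms extract the same values
```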

But you don't need to create separate vector objects. If you need the unique combinations, this can be done directly on the data frame:

 do.call(expand.grid,lapply(CRmarch30,unique))

Or if the columns are already unique

 do.call(expand.grid, CRmarch30)

Or using data.table

 library(data.table)
 setDT(CRmarch30)[,do.call(CJ, lapply(.SD, unique))]

data

set.seed(22)
CRmarch30 <- as.data.frame(matrix(sample(c(NA,0:5), 10*3,
                                    replace=TRUE), ncol=3))
akrun
  • Thanks a lot akrun. When I attempt to use do.call(expand.grid,lapply(CRmarch30,unique)), I get an error "Error: cannot allocate vector of size 512001.3 Gb In addition: Warning messages: 1: In rep.int(rep.int(seq_len(nx), rep.int(rep.fac, nx)), orep) : Reached total allocation of 8089Mb: see help(memory.size) 2: In rep.int(rep.int(seq_len(nx), rep.int(rep.fac, nx)), orep) : Reached total allocation of 8089Mb: see help(memory.size) 3: In rep.int(rep.int(seq_len(nx), rep.int(rep.fac, nx)), orep) : Reached total allocation of 8089Mb: see help(memory.size)" – Freewill Dec 03 '14 at 15:02
  • @user3007275 Looks like you have a large dataset. Running `expand.grid` on such a large dataset is very demanding. Perhaps you can try it on a system with more memory, or use `clusters`, databases, etc. – akrun Dec 03 '14 at 15:08
  • Thanks akrun. Yes, I have 1500 observations. I also have another dataset with 16000 observations, and I need to come up with similar combinations for that dataset as well. Is there any way to do it in R or Excel? How would clusters give me what I need? – Freewill Dec 03 '14 at 15:52
  • @user3007275 I meant the `cluster` of systems. If you are working/studying in Universities, they might have clusters where you can run this. – akrun Dec 03 '14 at 15:57
  • Is it possible to run this in R such that it doesn't have to display the output but rather store that output as a dataset/dataframe and that can be exported to excel ? I'm guessing this error is due to R running the code and attempting to generate the output ? – Freewill Dec 03 '14 at 19:02
  • @user3007275 You may check packages like bigmemory. I don't have much experience in it. – akrun Dec 03 '14 at 19:05
  • @user3007275 Maybe this link also helps http://stackoverflow.com/questions/5171593/r-memory-management-cannot-allocate-vector-of-size-n-mb – akrun Dec 03 '14 at 19:12
  • I tried running the code on a server with 64GB memory and I still get an error message "Error: cannot allocate vector of size 26.9GB" – Freewill Dec 04 '14 at 02:30
  • @user3007275 It means that the memory is insufficient. As I mentioned earlier, you can check other options like clusters, databases, etc. You may also check `expand.ffgrid` from `library(ffbase)` (though I am not very sure). – akrun Dec 04 '14 at 03:08
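
The memory errors discussed in the comments follow from how fast the grid grows: the number of rows is the product of the number of distinct values in each column. As a rough sketch, the size can be estimated before calling `expand.grid`. The data frame `df` below is made-up toy data in the style of the answer's example, and the 8 bytes per cell figure is only an assumption for a back-of-the-envelope estimate:

```r
# Toy data in the style of the answer's example (values are made up)
set.seed(22)
df <- as.data.frame(matrix(sample(c(NA, 0:5), 10 * 3, replace = TRUE), ncol = 3))

# Number of distinct values per column (unique() keeps NA as a value)
n_unique <- vapply(df, function(x) length(unique(x)), integer(1))

# Rows expand.grid() would produce: the product of those counts
n_rows <- prod(n_unique)

# Very rough size estimate, ASSUMING 8 bytes per cell
approx_gb <- n_rows * ncol(df) * 8 / 1024^3

# On toy data the grid is small enough to build and compare
grid <- do.call(expand.grid, lapply(df, unique))
nrow(grid) == n_rows

# If only the combinations that actually occur in the data are needed,
# unique() on the data frame returns those rows and never exceeds nrow(df)
observed <- unique(df)
```

With 7 columns that each hold hundreds of distinct values, `n_rows` can far exceed what fits in memory, whereas `unique(df)` is bounded by the number of observations. Whether the observed combinations are enough depends on what the table is for, which the question does not state.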