I have a large dataset that I am reading into R.
I want to apply the unique() function to it so that it is easier to work with, but when I try, I get this error:
clients <- unique(clients)
Error: cannot allocate vector of size 27.9 Mb
So I am trying to apply the function to the data piece by piece, like this:
clientsmd <- data.frame()
n <- 7316738  # Number of observations in the dataset
t <- 0
for (i in 1:200) {
  clientsm <- clients[1+(t*round((n/200))):(t+1)*round((n/200)),]
  clientsm <- unique(clientsm)
  clientsmd <- rbind(clientsm)
  t <- (t+1)
}
But I get this:
Error in `[.default`(xj, i) : subscript too large for 32-bit R
I have been told that this would be easier with packages such as "ff" or "bigmemory" (or others), but I don't know how to use them for this purpose.
I'd appreciate any guidance, whether it explains why my code doesn't work or shows how I could take advantage of these packages.
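
For reference, here is a minimal sketch of what the loop above seems meant to do, run on a small made-up data frame instead of the real data. Two things stand out in the original loop: in R the `:` operator binds more tightly than `+` and `*`, so the index expression is not parsed as "first row of chunk : last row of chunk", and `rbind(clientsm)` replaces the result on each pass instead of appending to it. The column names, the toy values, and the names `n_chunks`, `chunk_size`, `pieces` and `clients_unique` are assumptions made for illustration only. This uses base R rather than "ff" or "bigmemory", and it may still hit the memory limit on the full dataset.

```r
# Toy stand-in for the real `clients` data frame (hypothetical columns and values).
clients <- data.frame(id   = c(1, 2, 2, 3, 1, 4, 4, 5),
                      name = c("a", "b", "b", "c", "a", "d", "d", "e"))

n          <- nrow(clients)
n_chunks   <- 3                       # 200 in the original loop
chunk_size <- ceiling(n / n_chunks)

# Collect the de-duplicated chunks in a list instead of growing a data frame.
pieces <- vector("list", n_chunks)
for (k in seq_len(n_chunks)) {
  first <- (k - 1) * chunk_size + 1
  last  <- min(k * chunk_size, n)
  if (first > n) break                # no rows left for this chunk
  # Compute the bounds first: `:` has higher precedence than `+` and `*`.
  pieces[[k]] <- unique(clients[first:last, , drop = FALSE])
}

# The same row can appear in different chunks, so a final unique() is still needed.
clients_unique <- unique(do.call(rbind, pieces))
print(clients_unique)
```

Building the bounds as separate variables avoids the precedence problem entirely, and binding all pieces once at the end avoids copying the accumulated result on every iteration.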