I am trying to cross-join two data.frames (one with ~2,000 rows and four columns, the other with ~35,000 rows and four columns) using dplyr. I searched SO and found a tip to add a dummy column to both data.frames and join on it with dplyr::inner_join. I tried that, but R gave an error that it couldn't allocate a vector of a large size. (I got the same message using merge.) Is there any way to do this? A ~70M-row data.frame isn't that big memory-wise.
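For reference, the dummy-column approach can be sketched roughly as below. This is a minimal illustration, not the exact code that produced the error: `df_small`, `df_large`, and their column names are hypothetical stand-ins for the real data.frames, and the constant key column `dummy` is just an illustrative name.

```r
library(dplyr)

# Hypothetical stand-ins for the two data.frames (the real ones are far larger)
df_small <- data.frame(id_a = 1:3,
                       label_a = c("x", "y", "z"),
                       stringsAsFactors = FALSE)
df_large <- data.frame(id_b = 1:4,
                       value_b = c(0.1, 0.2, 0.3, 0.4))

# Give both sides the same constant key, join on it, then drop the key.
# Every row of df_small matches every row of df_large, which yields the
# Cartesian product of the two tables.
crossed <- df_small %>%
  mutate(dummy = 1L) %>%
  inner_join(mutate(df_large, dummy = 1L), by = "dummy") %>%
  select(-dummy)

nrow(crossed)  # number of rows in df_small times number of rows in df_large
```

Newer dplyr releases (1.1.0 and later) may warn about a many-to-many relationship on a join like this, and they also provide `cross_join()`, which does the same thing without a dummy key.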
-
What data types are the columns? I agree that the size shouldn't be giving you a problem. – Brian May 20 '17 at 11:11
-
You might find this useful: http://stackoverflow.com/questions/5171593/r-memory-management-cannot-allocate-vector-of-size-n-mb – Edgar Santos May 20 '17 at 12:06
-
Mostly character, a couple numeric. I've whittled it down to the essential columns that I will need after the join. – Ralph May 20 '17 at 15:17
-
Can you please provide the join code snippet you used, as well as the error message? – Yuval Spiegler May 21 '17 at 21:28
-
gc() did the trick... the join took no time at all after I restarted the R session and ran gc(). – Ralph May 22 '17 at 00:02