I'm trying to spread a very long data frame (17,000,000 rows; 111.2 MB RDS file) into wide format by a variable with ~2,000 unique values. Running this on a Linux machine with 16 cores and 64 GB of RAM results in this error: "Error: cannot allocate vector of size 3132.3GB".
The dplyr/tidyr code below works perfectly on smaller datasets (~1/3 the size).
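library(tidyverse)  # loads dplyr, tidyr and tibble

data <- data %>%
  rowid_to_column() %>%                            # add a unique id per row
  spread(key = parameter_name, value = value) %>%  # long to wide
  select(-rowid)                                   # drop the helper id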
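To show what the code does, here is a toy version of the same pipeline. This is only a sketch: the data are made up, and the `id` column is a hypothetical stand-in for whatever other columns the real data frame has. Because `rowid` is unique for every input row, the wide result keeps one row per input row, with `NA` in every parameter column except one. If the real data behave the same way, the result would have about 17,000,000 rows and ~2,000 parameter columns.

```r
library(dplyr)
library(tidyr)
library(tibble)

# Made-up stand-in for the real data; `id` is a hypothetical extra column
toy <- tibble(
  id             = c(1, 1, 2, 2),
  parameter_name = c("a", "b", "a", "b"),
  value          = c(10, 20, 30, 40)
)

wide <- toy %>%
  rowid_to_column() %>%                            # unique id per input row
  spread(key = parameter_name, value = value) %>%  # one column per parameter
  select(-rowid)

dim(wide)  # 4 rows, 3 columns (id, a, b): no rows are collapsed
```

Each row of `wide` has a value in either `a` or `b` and `NA` in the other.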
Any ideas on how to get this done? Is there a more efficient way to code this?