I have a file that I want to convert from wide to long format, but when I use the gather() function its size increases a lot.
The dataset starts at 332 MB (1048498 obs. of 64 variables).
After one gather() the size is 3 GB, and after a second it is 32.3 GB (177196162 obs. of 42 variables).

Does anyone know if this is normal behavior?

Edit: a reproducible example

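library(nycflights13)
library(dplyr)
library(tidyr)  # gather() lives in tidyr, not dplyr

# Join plane metadata onto every flight (wide format)
nycflightData <- dplyr::full_join(planes, flights, by = "tailnum")

# Gather seven plane columns into key/value pairs (long format)
nycflightDataLonger <- gather(nycflightData, planeVar, planeInfo, tailnum,
                              type, manufacturer, model, engine,
                              engines, seats, convert = TRUE)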

This dataset goes from 49 MB to 270 MB.

tertra

1 Answer

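I just realized that this does make sense: when going from wide to long, every gathered column contributes one row per original observation, so the full set of original observations is added again for each gathered column except the first. Gathering k columns therefore multiplies the number of rows by k, and the values in the columns that are not gathered are repeated in each of those rows.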
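
As a rough illustration of that multiplication, here is a minimal sketch on a made-up toy data frame (the names wide, long and the columns a to d are only for illustration). The second half checks the numbers from the question under the assumption that each of the two gather() calls collected 13 columns. The question does not state this; 13 is simply the value that reproduces the reported row and column counts.

```r
library(tidyr)

# Toy wide table: 3 observations, 1 id column and 4 measurement columns.
wide <- data.frame(
  id = 1:3,
  a  = c(1.1, 1.2, 1.3),
  b  = c(2.1, 2.2, 2.3),
  c  = c(3.1, 3.2, 3.3),
  d  = c(4.1, 4.2, 4.3)
)

# Gather the 4 measurement columns into a key column and a value column.
long <- gather(wide, key = "variable", value = "value", a, b, c, d)

nrow(wide)  # 3
nrow(long)  # 12 = 3 observations * 4 gathered columns
ncol(long)  # 3  = id + variable + value

# Check against the question's numbers.
# ASSUMPTION: each gather() call collected k = 13 columns.
rows_before <- 1048498
cols_before <- 64
k <- 13

rows_after <- rows_before * k * k          # two gathers multiply rows by k twice
cols_after <- cols_before - 2 * k + 2 * 2  # each gather drops k columns, adds 2

rows_after  # 177196162
cols_after  # 42
```

So the growth from about 1 million to about 177 million rows is what two successive gathers of that width would produce, and since the id-like columns are copied into every new row, the memory footprint grows with it.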

tertra