So I have the following code:

customers %>% 
  select(v1, v2, v3, v4, v5, ID) %>% 
  pivot_longer(-ID, names_to = "Variables", values_to = "count") %>% 
  filter(!is.na(count)) %>% 
  ggplot() + 
  geom_bar(aes(x = `Variables`, fill = count), position = "fill") +
  coord_flip() +
  theme_minimal()

Each of the variables can only be TRUE or FALSE. But with 7 million rows of data I am running into memory allocation issues, like the error seen above. What can I do to make this work?

John Thomas
  • Try appending one pipe at a time to see which one is causing problems. You should do this in a new R session with only `customers` in your global environment. Intermediate results require memory as they are not garbage collected instantaneously. You may need to assign one of the intermediate results to a variable, `rm` everything else in your environment, call `gc` to trigger garbage collection, then proceed. – Mikael Jagan Jan 20 '22 at 05:05
  • BTW, logical and integer vectors are stored identically at C level, as `int` arrays. Each element occupies 4 bytes of memory. That's 32 bits, not 1 bit. – Mikael Jagan Jan 20 '22 at 05:08
  • I have no problem on my 64-bit machine evaluating this expression. I am using a simulated data frame `customers` consisting of 7 million rows, 1 integer variable `ID`, and 5 logical variables `v[1-5]` (with ~0.5% of entries being `NA`). Either `object.size(customers)` is huge or you have other objects in your environment (or other processes on your machine) consuming a lot of memory... – Mikael Jagan Jan 20 '22 at 05:45
  • So I am realizing it is the entire R Markdown document that is giving me issues – John Thomas Jan 20 '22 at 14:39
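
The suggestions in the comments above (check the object size, build the pipeline one step at a time, free memory before plotting) can be sketched roughly as follows. This is only a sketch under stated assumptions, not the poster's actual setup: the simulated `customers` data frame, the helper `sim_logical`, and the intermediate name `long` are all hypothetical, and the row count is scaled down from the 7 million in the question so that it runs quickly.

```r
library(dplyr)
library(tidyr)

# Hypothetical stand-in for the real `customers` data (assumption):
# one integer ID and five logical columns with roughly 0.5% NA.
set.seed(1)
n <- 1e5  # scaled down; the question has 7 million rows (7e6)
sim_logical <- function(n) {
  sample(c(TRUE, FALSE, NA), n, replace = TRUE,
         prob = c(0.4975, 0.4975, 0.005))
}
customers <- data.frame(
  ID = seq_len(n),
  v1 = sim_logical(n), v2 = sim_logical(n), v3 = sim_logical(n),
  v4 = sim_logical(n), v5 = sim_logical(n)
)

# Logical and integer columns take 4 bytes per element, so check how
# large the object really is before blaming the pipeline.
print(object.size(customers), units = "MB")

# Run the data steps on their own and keep the intermediate result.
long <- customers %>%
  select(v1, v2, v3, v4, v5, ID) %>%
  pivot_longer(-ID, names_to = "Variables", values_to = "count") %>%
  filter(!is.na(count))

# Drop what is no longer needed and trigger garbage collection
# before handing `long` to ggplot().
rm(customers)
invisible(gc())
```

From here, `long` can be passed to the `ggplot()` call from the question. If the memory error only appears when knitting, comparing `object.size()` of the objects created in earlier chunks may show what else is being held in the session.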

0 Answers