I've created a tsibble of ~75K time series in RStudio on my local machine.

I'm looking for ways to speed up the processing time before I migrate the process to a VM with more processing power.

Does fable handle all of the parallel processing in the background, or are there further opportunities to make the code more efficient?

Here is an example of my code:

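library(fable)    # model(), forecast()
library(tsibble)  # group_by_key()
library(dplyr)    # %>%, ungroup()
library(future)   # plan(), multisession
library(tictoc)   # tic(), toc()

plan(multisession, gc = TRUE)  # run futures in background R sessions
tic()
results <- train %>%
  group_by_key() %>%
  model(my_dcmp_spec) %>%
  forecast(h = "10 weeks") %>%
  ungroup()
toc()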

Thank you in advance!

Axeman

1 Answer

Currently, fable fits the models for each series in parallel (in model()) according to your plan(). Forecasting (forecast()) is not yet done in parallel, but this is planned for an upcoming release: https://github.com/tidyverts/fabletools/issues/268
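As a rough illustration of the point above, here is a minimal sketch on made-up data. The toy tsibble, the ETS() model, and the worker count are assumptions for illustration only, not the asker's setup, and the sketch assumes the future.apply package is installed alongside future.

```r
library(fable)    # model(), forecast(), ETS()
library(tsibble)  # as_tsibble(), yearmonth()
library(dplyr)    # tibble(), %>%
library(future)   # plan(), multisession, sequential

# Hypothetical toy data: 3 monthly series with 36 observations each
set.seed(1)
toy <- tibble(
  id    = rep(c("a", "b", "c"), each = 36),
  month = rep(yearmonth("2020 Jan") + 0:35, times = 3),
  value = rnorm(108, mean = 100, sd = 10)
) %>%
  as_tsibble(key = id, index = month)

# Model fitting is the step that follows the active plan()
plan(multisession, workers = 2)
fits <- toy %>% model(ets = ETS(value))
plan(sequential)  # shut down the background workers

# Per the answer above, forecasting currently runs in the main session
fc <- fits %>% forecast(h = "6 months")
```

Since only model() is parallelised, timing model() and forecast() separately (for example with tic()/toc()) should show which step dominates before moving to a larger VM.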

  • I need a clarification. When model() works in parallel, does it split the data, the models, or both? Assuming we have a tsibble with a key that has three levels and we want to test modelA and modelB, will the parallelization split into six, three, or two streams? – Andrea May 17 '22 at 09:27
  • Currently it is split into 6, however there is some argument that splitting 3 into 2 would be better (less transfer of data to worker nodes). – Mitchell O'Hara-Wild May 17 '22 at 10:45
  • Thanks Mitch. Your support is always extremely valuable – Andrea May 18 '22 at 12:13
  • @MitchellO'Hara-Wild how about the interpolate function? Will that parallelize? – Alfredo G Marquez Sep 08 '22 at 03:51
  • Not currently, but when implemented it will without changes to your code. – Mitchell O'Hara-Wild Sep 08 '22 at 04:20
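
To make the "split into 6" case from the comments concrete, here is a small sketch with three series and two models. The data and the choice of MEAN() and NAIVE() are hypothetical stand-ins for modelA and modelB; the code only counts the series-by-model fits in the resulting mable and does not show how the work is actually scheduled across workers.

```r
library(fable)    # model(), MEAN(), NAIVE(); attaches fabletools (mable_vars())
library(tsibble)  # as_tsibble(), yearmonth()
library(dplyr)    # tibble(), %>%

# Hypothetical data: a key with three levels, 24 monthly observations each
toy <- tibble(
  id    = rep(c("a", "b", "c"), each = 24),
  month = rep(yearmonth("2021 Jan") + 0:23, times = 3),
  value = rep(1:24, times = 3) + rep(c(0, 10, 20), each = 24)
) %>%
  as_tsibble(key = id, index = month)

# Two model definitions fitted to every series
fits <- toy %>%
  model(modelA = MEAN(value), modelB = NAIVE(value))

n_series <- nrow(fits)                # one row per key level: 3
n_models <- length(mable_vars(fits))  # one column per model definition: 2
n_fits   <- n_series * n_models       # 6 individual series-by-model fits
```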