
I am trying to perform a stream-static join. My static table is less than 500 MB in size, and I cached it so that a refresh of the underlying table won't affect the join. When I checked the DAG, I noticed that the `.cache()` step is executed in every micro-batch.

Is it true that in Spark Structured Streaming, even if the static dataset is cached, that step is executed again in every micro-batch?

Matthias J. Sax

0 Answers