I checkpointed an RDD that takes a very long time to compute, then ran many jobs on that RDD. Eventually one of those jobs failed and the driver shut down overnight. Now I need to recover the checkpointed data, but I can't. There are many similar questions on SO, but none of them answers this one. For example:
How to read checkpointed RDD <= The only answer just repeats the documentation, which doesn't help.
How to recover from checkpoint when using python spark direct approach? <= This one is about the streaming context, not a plain RDD.
My environment is Azure Databricks notebooks, Spark 2.4.3, and Python 3.