
What if I break the long lineage with a call like the one below, rather than with checkpoint:

myDataframe.sqlContext().createDataFrame(myDataframe.toJavaRDD(), myDataframe.schema()).cache()

What impact will it have?

user4157124
  • Also, I found this link useful: https://stackoverflow.com/questions/57750413/spark-createdataframedf-rdd-df-schema-vs-checkpoint-for-breaking-lineage – Rituparno Behera Jun 15 '20 at 19:20
  • But I think myDataframe.sqlContext().createDataFrame(myDataframe.toJavaRDD(), myDataframe.schema()).cache() will have an issue with respect to fault tolerance. – Rituparno Behera Jun 16 '20 at 05:21

0 Answers