I have been using S3 for checkpointing with Structured Streaming. However, I am getting a FileNotFoundException related to S3's eventual consistency.

Below is my current S3 checkpointing setup.

 val msg = testMsgs.writeStream.option("checkpointLocation",
 "s3://<bucket-name>/checkpoint123").foreach(writer).start()

I am planning to switch to EMRFS, since my Spark job runs on EMR.

How reliable is EMRFS and how do I use EMRFS for checkpointing?

Will the way we implement checkpointing need to change?

How do I enable EMRFS in EMR?

fledgling

0 Answers