I am trying to checkpoint an RDD to a non-HDFS file system. From the DSE documentation, it seems that it is not possible to use the Cassandra File System for this, so I am planning to use Amazon S3 instead. However, I have not been able to find a good example of using S3 as a checkpoint location.
Questions
- How do I use Amazon S3 as the checkpoint directory? Is it enough to call ssc.checkpoint(amazons3url)?
- Is it possible to use any reliable data store other than the Hadoop file system for checkpointing?
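
To make the first question concrete, here is a rough, untested sketch of what I have in mind, written against the PySpark API. It assumes the Hadoop S3A connector (hadoop-aws) is on the classpath and that credentials are passed through the `fs.s3a.access.key` and `fs.s3a.secret.key` Hadoop properties. The helper names, bucket name, and prefix are hypothetical placeholders, and reaching the Hadoop configuration through `_jsc` relies on a private PySpark attribute.

```python
def s3a_checkpoint_url(bucket, prefix):
    """Build an s3a:// URL for the checkpoint directory (names are placeholders)."""
    return "s3a://{}/{}".format(bucket, prefix.strip("/"))


def enable_s3_checkpoint(ssc, bucket, prefix, access_key, secret_key):
    """Point a StreamingContext's checkpoint directory at S3.

    Assumption: the S3A connector is available, so Hadoop can resolve s3a:// paths.
    """
    # _jsc is a private attribute; it exposes the underlying Hadoop configuration.
    hadoop_conf = ssc.sparkContext._jsc.hadoopConfiguration()
    hadoop_conf.set("fs.s3a.access.key", access_key)
    hadoop_conf.set("fs.s3a.secret.key", secret_key)

    url = s3a_checkpoint_url(bucket, prefix)
    # This is the ssc.checkpoint(amazons3url) call from the question.
    ssc.checkpoint(url)
    return url
```

I would call it as `enable_s3_checkpoint(ssc, "my-bucket", "spark/checkpoints", access_key, secret_key)` right after creating the StreamingContext. What I do not know is whether this is sufficient, or whether S3 behaves reliably enough as a checkpoint store.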