I am saving an ML model to an S3 bucket. After a long search this thread helped me find a solution. My code looks as follows:
sc.parallelize(Seq(model), 1).saveAsObjectFile("s3a://bucket/nameModel.model")
The first time I ran this job everything went fine. The second time I got:
FileAlreadyExistsException: Output directory "s3a://bucket/nameModel.model" already exists
I couldn't find a way to overwrite the model, so I first tried deleting the existing one before saving it:
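import com.amazonaws.services.s3.AmazonS3Client
import com.amazonaws.services.s3.model.DeleteObjectRequest
val instanceProfileCredentialsProvider = new com.amazonaws.auth.InstanceProfileCredentialsProvider()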
val amazonS3Client = new AmazonS3Client(instanceProfileCredentialsProvider)
amazonS3Client.deleteObject(new DeleteObjectRequest("bucket", "nameModel.model"))
sc.parallelize(Seq(model), 1).saveAsObjectFile("s3a://bucket/nameModel.model")
No success: I still get the same exception, so the new code doesn't seem to delete the existing model. Is there another way to overwrite or delete the current ML model in the S3 bucket?
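A possible explanation, as far as I can tell: saveAsObjectFile writes a directory of part files (keys such as nameModel.model/part-00000), not a single object, so deleting the one key "nameModel.model" leaves those files in place. Below is a minimal sketch of an alternative I would try, which deletes the whole output path recursively through the Hadoop FileSystem API. The helper name deleteIfExists is hypothetical, and I am assuming the s3a credentials are already present in the Hadoop configuration that Spark uses.

```scala
import java.net.URI
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}

// Hypothetical helper (not part of Spark or Hadoop): recursively deletes
// `target` if it exists. Returns true only if something was deleted.
def deleteIfExists(target: String, hadoopConf: Configuration): Boolean = {
  // Resolve the file system from the URI scheme (s3a://, file://, hdfs://, ...)
  val fs = FileSystem.get(new URI(target), hadoopConf)
  val path = new Path(target)
  // recursive = true removes the directory and every part file beneath it
  fs.exists(path) && fs.delete(path, true)
}

// Usage inside the job (assumes `sc` and `model` from the code above):
// val out = "s3a://bucket/nameModel.model"
// deleteIfExists(out, sc.hadoopConfiguration)
// sc.parallelize(Seq(model), 1).saveAsObjectFile(out)
```

Going through the same FileSystem abstraction that Spark uses for writing means the delete sees the output the same way the save does, as a directory rather than a single key.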