I am having trouble with a Spring Batch job that saves a CSV file to an S3 location.
I have a long list of CSV files to read from. Depending on the status of each row, if it is active, I write it to an output CSV file, and after the batch job the file should be stored in an S3 bucket.
So far, I have read some articles about this. "Read AWS s3 File to Java code" says that the File object does not understand S3. The code below works for local development, but as soon as I set the location to the S3 bucket and run it in a production environment, it does not work.
void write(List<? extends Subscription> items) throws Exception {
    String location = "/src/main/java/resources/output.csv"
    // second argument true = append to the existing file
    FileWriter writer = new FileWriter(location, true)
    BufferedWriter bw = new BufferedWriter(writer)
    new CSVPrinter(bw, CSVFormat.DEFAULT.withDelimiter('|' as char))
        .withCloseable { CSVPrinter csvPrinter ->
            items.each { subscription ->
                csvPrinter.printRecord(
                    subscription.id,
                    subscription.subscription
                )
            }
            csvPrinter.flush()
        }
}
Since the file is very long, I had to set the second argument of FileWriter to true so that each batch appends its data to the CSV. Otherwise, the CSV file would contain only the data from the last batch. However, I cannot pass the S3 bucket location string to FileWriter, because it does not understand S3.
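To make the append behaviour concrete, here is a minimal plain-Java sketch of what I mean, outside of Spring Batch. The class name `AppendDemo`, the helper names, and the sample rows are made up for illustration; it writes two "batches" to a temporary file, reopening the file for each batch the same way my `write` method does.

```java
import java.io.FileWriter;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

public class AppendDemo {
    // Writes one batch of lines, opening a fresh FileWriter each time.
    static void writeChunk(Path file, List<String> lines, boolean append) throws IOException {
        try (FileWriter writer = new FileWriter(file.toFile(), append)) {
            for (String line : lines) {
                writer.write(line + "\n");
            }
        }
    }

    // Writes two batches to a temp file and returns the lines that ended up in it.
    static List<String> run(boolean append) {
        try {
            Path file = Files.createTempFile("output", ".csv");
            try {
                writeChunk(file, List.of("1|a", "2|b"), append);
                writeChunk(file, List.of("3|c"), append);
                return Files.readAllLines(file);
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(run(true));   // prints [1|a, 2|b, 3|c]
        System.out.println(run(false));  // prints [3|c]
    }
}
```

With `append` set to false, only the last batch survives, which is the same symptom I am trying to avoid.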
So the second option I tried was:
void write(List<? extends Subscription> items) throws Exception {
    WritableResource resource = resourceLoader.getResource("s3://s3-bucket-location/output.csv") as WritableResource
    // a new output stream is opened on every call to write()
    def writer = new OutputStreamWriter(resource.getOutputStream())
    new CSVPrinter(writer, CSVFormat.DEFAULT.withDelimiter('|' as char))
        .withCloseable { CSVPrinter csvPrinter ->
            items.each { subscription ->
                csvPrinter.printRecord(
                    subscription.id,
                    subscription.subscription
                )
            }
            csvPrinter.flush()
        }
}
This writes the CSV file and stores it in the S3 bucket location as expected. However, the file contains only the data from the last batch, because each batch overwrites the file instead of appending to it.
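My understanding of the overwrite can be sketched in plain Java without S3. This is only a simulation under an assumption: the `bucket` map and the `openObject` helper are hypothetical stand-ins for the bucket and for `resource.getOutputStream()`, modelling an object store where each finished upload replaces the whole object under that key.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class OverwriteDemo {
    // Hypothetical stand-in for a bucket: key -> complete object content.
    static final Map<String, String> bucket = new HashMap<>();

    // Hypothetical stand-in for getOutputStream(): on close, the bytes
    // written to this stream replace whatever was stored under the key.
    static OutputStream openObject(String key) {
        return new ByteArrayOutputStream() {
            @Override
            public void close() {
                bucket.put(key, new String(toByteArray(), StandardCharsets.UTF_8));
            }
        };
    }

    // Writes one batch through a freshly opened stream, like my second attempt.
    static void writeChunk(String key, List<String> lines) {
        try (Writer writer = new OutputStreamWriter(openObject(key), StandardCharsets.UTF_8)) {
            for (String line : lines) {
                writer.write(line + "\n");
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Writes two batches to the same key and returns the stored content.
    static String run() {
        bucket.clear();
        writeChunk("output.csv", List.of("1|a", "2|b"));
        writeChunk("output.csv", List.of("3|c"));
        return bucket.get("output.csv");
    }

    public static void main(String[] args) {
        System.out.print(run());  // prints only the last batch: 3|c
    }
}
```

If this model is right, there is no equivalent of FileWriter's append flag here, so opening the stream once per batch can never accumulate rows.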
I am stuck on this and not sure how to solve it. I would really appreciate it if someone could help me figure this out. I am using Groovy, by the way.
Thank you in advance.
Edit: I have updated the code as shown below, but now it does not write to a file at all; the file is not even created. Also, when I used withCloseable instead of with, it threw an exception complaining that the stream is closed. I am not sure why I get this exception.
@Autowired
ResourceLoader resourceLoader
WritableResource resource
OutputStreamWriter writer
CSVPrinter csvPrinter
void setup() {
    this.resource = this.resourceLoader.getResource("s3://output.csv") as WritableResource
    this.writer = new OutputStreamWriter(this.resource.getOutputStream())
    this.csvPrinter = new CSVPrinter(this.writer, CSVFormat.DEFAULT.withDelimiter('|' as char))
}
@Override
void write(List<? extends Subscription> items) throws Exception {
    // not writing to a file
    this.csvPrinter.printRecord("test", "test1")
    // not writing to a file
    csvPrinter.with {
        items.each { Subscription subscription ->
            csvPrinter.printRecord(
                subscription.id,
                subscription.subscription
            )
        }
    }
    // not writing to a file
    items.each { it ->
        csvPrinter.printRecord(it.id, it.subscription)
    }
}
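For reference, both symptoms can be reproduced without Spring or S3. The plain-Java sketch below is only a guess at what is going on, under two assumptions: that withCloseable closes the printer and its underlying writer at the end of the closure (as try-with-resources does), and that nothing in my updated code ever flushes or closes the shared writer. The class and method names are made up for illustration, and a ByteArrayOutputStream stands in for the S3 stream.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

public class StreamClosedDemo {
    // Shares one writer across two batches, but closes it after the first
    // batch. Returns true if the second batch fails with an IOException.
    static boolean secondWriteFailsAfterClose() {
        Writer writer = new OutputStreamWriter(new ByteArrayOutputStream(), StandardCharsets.UTF_8);
        try {
            try (Writer w = writer) {  // closes the writer when the block ends
                w.write("1|a\n");
            }
            writer.write("2|b\n");     // second batch on the already closed writer
            return false;
        } catch (IOException e) {
            return true;
        }
    }

    // Writes one row and reports how many bytes reached the underlying
    // stream before and after an explicit flush.
    static int[] bytesBeforeAndAfterFlush() {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        Writer writer = new OutputStreamWriter(sink, StandardCharsets.UTF_8);
        try {
            writer.write("1|a\n");
            int before = sink.size();  // the row is still in the writer's buffer
            writer.flush();
            int after = sink.size();
            return new int[] {before, after};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(secondWriteFailsAfterClose());  // prints true
        int[] sizes = bytesBeforeAndAfterFlush();
        System.out.println(sizes[0] + " " + sizes[1]);     // prints 0 4
    }
}
```

So a writer shared across batches cannot be closed per batch, but it apparently still has to be flushed and closed once at the end of the job. I have not confirmed whether the S3 output stream also needs to be closed before the file appears in the bucket.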