I am having trouble with a Spring Batch job that saves a CSV file to an S3 location.

I have a long list of CSV files to read from. Depending on the status of each row, if it is active, I write it to an output CSV file, and after the batch job the file should be stored in the S3 bucket.

So far, I have read some articles about this. "Read AWS s3 File to Java code" says that a File object does not understand S3. So the code below works for local development, but as soon as I set the location to the S3 bucket and run it in a production environment, it does not work.

void write(List<? extends Subscription> items) throws Exception {
  String location = "/src/main/java/resources/output.csv"
  // append = true, so each chunk is added to the end of the file
  FileWriter writer = new FileWriter(location, true)
  BufferedWriter bw = new BufferedWriter(writer)

  new CSVPrinter(bw, CSVFormat.DEFAULT.withDelimiter('|' as char))
    .withCloseable { CSVPrinter csvPrinter ->
    items.each { subscription ->
      csvPrinter.printRecord(
        subscription.id,
        subscription.subscription
      )
    }
    csvPrinter.flush()
  }
}

Since the file is very long, I had to set the second argument of FileWriter to true so that it appends the data to the CSV file. Otherwise, the file would contain only the data from the last batch. However, I cannot pass the S3 bucket location as a string to FileWriter, because it does not understand S3.
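To make the append behaviour described above concrete, here is a minimal sketch in plain JDK Java (the same calls work from Groovy). The class name `AppendFlagDemo`, the helper `writeChunks`, and the sample rows are hypothetical and only serve as illustration; the one thing taken from the code above is the second argument of `FileWriter`.

```java
import java.io.FileWriter;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

// Hypothetical demo class; the names are illustrative only.
class AppendFlagDemo {

    // Opens a fresh FileWriter for every chunk, as the write() method above does,
    // and returns the lines found in the file after all chunks were written.
    static List<String> writeChunks(boolean append, List<List<String>> chunks) {
        try {
            Path file = Files.createTempFile("output", ".csv");
            try {
                for (List<String> chunk : chunks) {
                    // Second argument: true appends, false truncates the file first.
                    try (FileWriter writer = new FileWriter(file.toFile(), append)) {
                        for (String row : chunk) {
                            writer.write(row + "\n");
                        }
                    }
                }
                return Files.readAllLines(file);
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        List<List<String>> chunks = List.of(List.of("1|a", "2|b"), List.of("3|c"));
        System.out.println(writeChunks(true, chunks));   // rows from both chunks
        System.out.println(writeChunks(false, chunks));  // rows from the last chunk only
    }
}
```

With `append` set to false, every chunk truncates the file before writing, which matches the "only the data from the last batch" symptom.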

So the second option I tried was

void write(List<? extends Subscription> items) throws Exception {
  // a new output stream (and therefore a new S3 object) is opened on every call
  WritableResource resource = resourceLoader.getResource("s3://s3-bucket-location/output.csv") as WritableResource
  def writer = new OutputStreamWriter(resource.getOutputStream())

  new CSVPrinter(writer, CSVFormat.DEFAULT.withDelimiter('|' as char))
    .withCloseable { CSVPrinter csvPrinter ->
    items.each { subscription ->
      csvPrinter.printRecord(
        subscription.id,
        subscription.subscription
      )
    }
    csvPrinter.flush()
  }
}

This writes the CSV file and stores it in the S3 bucket location as expected. But in this case, the file contains only the data from the last batch, because it is not appending; it simply overwrites the file in each batch.
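One way to picture the overwrite problem: as far as I understand, a regular S3 object is replaced as a whole on every upload, so there is no append mode to switch on. A possible workaround is to open the output stream once, write every chunk to it, and close it once at the end. The sketch below is plain JDK Java under that assumption. `SingleStreamCsvWriter` and its `open`/`write`/`close` methods are hypothetical names, and a `ByteArrayOutputStream` stands in for the S3 output stream. In Spring Batch, the open and close hooks would presumably map to the `ItemStream` callbacks.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.util.List;

// Hypothetical writer: one stream for the whole step instead of one per chunk.
class SingleStreamCsvWriter {
    private Writer writer;

    // Called once, before the first chunk.
    void open(OutputStream target) {
        writer = new OutputStreamWriter(target, StandardCharsets.UTF_8);
    }

    // Called once per chunk; rows accumulate because the stream stays open.
    void write(List<List<String>> rows) {
        try {
            for (List<String> row : rows) {
                writer.write(String.join("|", row) + "\n");
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Called once, after the last chunk; flushes and closes the target stream.
    void close() {
        try {
            writer.close();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Simulates two chunks followed by the end of the step.
    static String demo() {
        ByteArrayOutputStream target = new ByteArrayOutputStream(); // stand-in for the S3 stream
        SingleStreamCsvWriter csv = new SingleStreamCsvWriter();
        csv.open(target);
        csv.write(List.of(List.of("1", "a"), List.of("2", "b")));
        csv.write(List.of(List.of("3", "c")));
        csv.close();
        return new String(target.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.print(demo());
    }
}
```

The design choice here is that the lifetime of the stream follows the step, not the chunk, so nothing is overwritten between calls to `write`.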

I am stuck on this and not sure how to solve it. I would really appreciate it if someone could help me figure this out. I am using Groovy as the language, by the way.

Thank you in advance.

Edit: I have updated the code as below, but now it is not writing to a file at all; the file is not even generated. Also, when I used withCloseable instead of with, it throws an exception complaining that the stream is closed. I am not sure why I get this exception.

@Autowired
ResourceLoader resourceLoader
WritableResource resource
OutputStreamWriter writer
CSVPrinter csvPrinter

void setup() {
  this.resource = this.resourceLoader.getResource("s3://output.csv") as WritableResource
  this.writer = new OutputStreamWriter(this.resource.getOutputStream())
  this.csvPrinter = new CSVPrinter(this.writer, CSVFormat.DEFAULT.withDelimiter('|' as char))
}

@Override
void write(List<? extends Subscription> items) throws Exception {
  // not writing to a file
  this.csvPrinter.printRecord("test", "test1")
  // not writing to a file
  csvPrinter.with {
    items.each { Subscription subscription ->
      csvPrinter.printRecord(
        subscription.id,
        subscription.subscription
      )
    }
  }
  // not writing to a file
  items.each { it ->
    csvPrinter.printRecord(it.id, it.subscription)
  }
}
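A plausible explanation for both symptoms, sketched in plain JDK Java with a `ByteArrayOutputStream` standing in for the S3 stream (this is an assumption, not something confirmed for the S3 resource itself): `OutputStreamWriter` buffers small writes until `flush()` or `close()` is called, and a writer that has been closed after the first chunk rejects any later write. The class and method names below are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; the names are illustrative only.
class FlushAndCloseDemo {

    // Returns how many bytes reached the target before and after flush().
    static int[] bytesBeforeAndAfterFlush() {
        try {
            ByteArrayOutputStream target = new ByteArrayOutputStream();
            Writer writer = new OutputStreamWriter(target, StandardCharsets.UTF_8);
            writer.write("test|test1\n");
            int before = target.size(); // the record is still in the writer's buffer
            writer.flush();
            int after = target.size();  // the record has reached the target
            writer.close();
            return new int[] {before, after};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Returns true if writing after close() fails with an IOException.
    static boolean writeAfterCloseFails() {
        Writer writer = new OutputStreamWriter(new ByteArrayOutputStream(), StandardCharsets.UTF_8);
        try {
            writer.write("chunk 1\n");
            writer.close();             // what a closing block does after the first chunk
            writer.write("chunk 2\n");  // the second chunk hits a closed writer
            return false;
        } catch (IOException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        int[] sizes = bytesBeforeAndAfterFlush();
        System.out.println(sizes[0] + " bytes before flush, " + sizes[1] + " bytes after flush");
        System.out.println("write after close fails: " + writeAfterCloseFails());
    }
}
```

If the same holds for the S3-backed stream, the writer in `setup()` would need to be flushed and closed exactly once after the last chunk, not inside `write`.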
Moon Rise
  • You are creating a `new CSVPrinter` in each call to `write`. This will create a file per chunk. Is this what you mean by `it will contain only the data from the last batch as it is not appending`? If that's the case, you probably need to extract the `CSVPrinter` as a field in your writer class. – Mahmoud Ben Hassine Sep 20 '21 at 06:24
  • Oh, I guess so. Let me try that theory really quick. – Moon Rise Sep 20 '21 at 12:49
  • I extracted the csvPrinter into a field, but now it is not writing to the file anymore. I spent a whole day searching for a fix for this issue, but I could not make it work yet. – Moon Rise Sep 20 '21 at 20:31

0 Answers