We have a use case where we create and download a PDF that combines many PDFs, sometimes thousands of files pulled from an AWS S3 bucket and merged into one. We also generate a second PDF containing forms for data entry. These two PDFs are then packaged into a zip file that the user downloads.
We are using iText 5.5.8, Java 8, Spring.
These are generated on demand based on user-entered criteria. The majority of these zips/PDFs exceed 250 MB in file size and typically consume gigabytes of heap memory while being built.
We thought about pre-generating them overnight. However, that could mean thousands upon thousands of combinations, all of which would have to be regenerated every night, since the underlying data can change daily and users want the PDFs up to the minute.
I was thinking of ways to generate the output one piece at a time, saving the intermediate results to S3, perhaps via command-line tools invoked from the application, but I am not sure that is a good solution.
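To make that idea concrete, here is a rough sketch using the AWS SDK for Java instead of command-line tools. The class name, bucket, and file paths are made up for illustration; the point is that putObject(bucket, key, file) streams from disk, so a merged piece never has to fit on the heap:

import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.AmazonS3ClientBuilder;

import java.io.File;

public class PieceUploadSketch {
    // Rough sketch: upload one already-merged piece to S3.
    // putObject(bucket, key, file) streams the file from disk,
    // so the piece never has to be loaded into memory.
    static void uploadPiece(File piece, String key) {
        AmazonS3 s3 = AmazonS3ClientBuilder.defaultClient();
        s3.putObject("my-output-bucket", key, piece); // bucket/key names are hypothetical
    }
}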
There must be some way to achieve this while keeping the files out of heap memory. Right now the codebase collects everything as InputStreams backed by byte arrays, so every source file is held in memory at once:
List<InputStream> inputStreams = new ArrayList<>();
for (DocumentEntity documentEntity : documentEntities) {
    String fileName = documentEntity.getFileName();
    if (fileName.endsWith(".pdf")) {
        // the entire file is pulled from S3 into a byte[] on the heap
        byte[] fileAsBytes = documentFileService.getFile(documentEntity.getFilePath());
        inputStreams.add(new ByteArrayInputStream(fileAsBytes));
    }
}
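For contrast, this is the direction I was considering: append each source PDF one at a time with PdfCopy, writing straight to a temp file on disk, then stream the zip directly to the servlet response. This is only a rough sketch against iText 5.5.8, not production code. DocumentEntity and documentFileService are our existing types from above; the class name, method names, and file names are made up:

import com.itextpdf.text.Document;
import com.itextpdf.text.pdf.PdfCopy;
import com.itextpdf.text.pdf.PdfReader;

import javax.servlet.http.HttpServletResponse;
import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

public class StreamingMergeSketch {

    // Appends each source PDF one at a time, writing straight to a temp file,
    // so at most one source document is on the heap at any moment.
    static Path mergeToTempFile(List<DocumentEntity> documentEntities,
                                DocumentFileService documentFileService) throws Exception {
        Path merged = Files.createTempFile("merged-", ".pdf");
        Document document = new Document();
        try (OutputStream out = new BufferedOutputStream(Files.newOutputStream(merged))) {
            PdfCopy copy = new PdfCopy(document, out);
            document.open();
            for (DocumentEntity entity : documentEntities) {
                if (!entity.getFileName().endsWith(".pdf")) {
                    continue;
                }
                // still one byte[] per file, but never the whole set at once
                byte[] fileAsBytes = documentFileService.getFile(entity.getFilePath());
                PdfReader reader = new PdfReader(fileAsBytes);
                copy.addDocument(reader);
                copy.freeReader(reader); // flush this reader's pages to the output
                reader.close();
            }
            document.close();
        }
        return merged;
    }

    // Streams the merged PDF and the forms PDF into a zip written directly
    // to the servlet response; nothing beyond a copy buffer stays on the heap.
    static void writeZip(HttpServletResponse response, Path mergedPdf, Path formsPdf)
            throws IOException {
        response.setContentType("application/zip");
        response.setHeader("Content-Disposition", "attachment; filename=\"documents.zip\"");
        try (ZipOutputStream zip = new ZipOutputStream(response.getOutputStream())) {
            addEntry(zip, "merged.pdf", mergedPdf);
            addEntry(zip, "forms.pdf", formsPdf);
        }
    }

    private static void addEntry(ZipOutputStream zip, String name, Path file) throws IOException {
        zip.putNextEntry(new ZipEntry(name));
        Files.copy(file, zip); // copies in chunks, never loads the whole file
        zip.closeEntry();
    }
}

The merged file lives on disk between the two steps and can be deleted afterwards, so peak heap usage should be bounded by the largest single source PDF rather than the full combined size. I have not verified this at our scale, though.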
Any suggestions or possible other solutions within iText or other products?