We have a use case where we create and download a PDF that combines many PDFs, sometimes thousands of files pulled from an AWS S3 bucket and merged into one. We also generate a second PDF containing forms for data entry. These two PDFs are then packaged into a zip file that the user downloads.
We are using iText 5.5.8, Java 8, Spring.
These are generated on demand based on user-entered criteria. The majority of these zips/PDFs exceed 250 MB in file size and typically consume gigabytes of heap memory while being built.
We thought about pre-generating them overnight. However, that could mean thousands upon thousands of combinations, all of which would have to be regenerated every night, since the underlying data can change daily and users want the PDFs up to the minute.
I was thinking of ways to generate the output one piece at a time, saving the intermediate results to S3, perhaps via command-line tools invoked from the application, but I am not sure that is a good solution.
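To make that idea concrete, here is a rough sketch using the AWS SDK for Java instead of command-line tools. The class name, bucket, and file paths are made up for illustration; the point is that putObject(bucket, key, file) streams from disk, so a merged piece never has to fit on the heap:

import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.AmazonS3ClientBuilder;

import java.io.File;

public class PieceUploadSketch {
    // Rough sketch: upload one already-merged piece to S3.
    // putObject(bucket, key, file) streams the file from disk,
    // so the piece never has to be loaded into memory.
    static void uploadPiece(File piece, String key) {
        AmazonS3 s3 = AmazonS3ClientBuilder.defaultClient();
        s3.putObject("my-output-bucket", key, piece); // bucket/key names are hypothetical
    }
}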
There must be some way to achieve this while keeping the files out of heap memory. Right now the codebase collects everything as InputStreams backed by byte arrays, so every source file is held in memory at once:
List<InputStream> inputStreams = new ArrayList<>();
for (DocumentEntity documentEntity : documentEntities) {
    String fileName = documentEntity.getFileName();
    if (fileName.endsWith(".pdf")) {
        // the entire file is pulled from S3 into a byte[] on the heap
        byte[] fileAsBytes = documentFileService.getFile(documentEntity.getFilePath());
        inputStreams.add(new ByteArrayInputStream(fileAsBytes));
    }
}
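For contrast, this is the direction I was considering: append each source PDF one at a time with PdfCopy, writing straight to a temp file on disk, then stream the zip directly to the servlet response. This is only a rough sketch against iText 5.5.8, not production code. DocumentEntity and documentFileService are our existing types from above; the class name, method names, and file names are made up:

import com.itextpdf.text.Document;
import com.itextpdf.text.pdf.PdfCopy;
import com.itextpdf.text.pdf.PdfReader;

import javax.servlet.http.HttpServletResponse;
import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

public class StreamingMergeSketch {

    // Appends each source PDF one at a time, writing straight to a temp file,
    // so at most one source document is on the heap at any moment.
    static Path mergeToTempFile(List<DocumentEntity> documentEntities,
                                DocumentFileService documentFileService) throws Exception {
        Path merged = Files.createTempFile("merged-", ".pdf");
        Document document = new Document();
        try (OutputStream out = new BufferedOutputStream(Files.newOutputStream(merged))) {
            PdfCopy copy = new PdfCopy(document, out);
            document.open();
            for (DocumentEntity entity : documentEntities) {
                if (!entity.getFileName().endsWith(".pdf")) {
                    continue;
                }
                // still one byte[] per file, but never the whole set at once
                byte[] fileAsBytes = documentFileService.getFile(entity.getFilePath());
                PdfReader reader = new PdfReader(fileAsBytes);
                copy.addDocument(reader);
                copy.freeReader(reader); // flush this reader's pages to the output
                reader.close();
            }
            document.close();
        }
        return merged;
    }

    // Streams the merged PDF and the forms PDF into a zip written directly
    // to the servlet response; nothing beyond a copy buffer stays on the heap.
    static void writeZip(HttpServletResponse response, Path mergedPdf, Path formsPdf)
            throws IOException {
        response.setContentType("application/zip");
        response.setHeader("Content-Disposition", "attachment; filename=\"documents.zip\"");
        try (ZipOutputStream zip = new ZipOutputStream(response.getOutputStream())) {
            addEntry(zip, "merged.pdf", mergedPdf);
            addEntry(zip, "forms.pdf", formsPdf);
        }
    }

    private static void addEntry(ZipOutputStream zip, String name, Path file) throws IOException {
        zip.putNextEntry(new ZipEntry(name));
        Files.copy(file, zip); // copies in chunks, never loads the whole file
        zip.closeEntry();
    }
}

The merged file lives on disk between the two steps and can be deleted afterwards, so peak heap usage should be bounded by the largest single source PDF rather than the full combined size. I have not verified this at our scale, though.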
Any suggestions or possible other solutions within iText or other products?