I have been staring at this code for a long time, trying to reduce the amount of memory it uses, and it still throws java.lang.OutOfMemoryError: Java heap space. As a last resort, I want to ask the community how I can improve this code to avoid the OutOfMemoryError.

I have a driver/manifest file (a .txt file) that contains information about the PDFs. I have about 2000-5000 PDFs inside a zip file that I need to combine. Before combining, I need to add 2-3 more pages to each PDF. A Manifest object holds information about one PDF.

try{
    blankPdf = new PdfReader(new FileInputStream(config.getBlankPdf()));
    mdxBacker = new PdfReader(new FileInputStream(config.getMdxBacker()));
    theaBacker = new PdfReader(new FileInputStream(config.getTheaBacker()));
    mdxAffidavit = new PdfReader(new FileInputStream(config.getMdxAffidavit()));
    theaAffidavit = new PdfReader(new FileInputStream(config.getTheaAffidavit()));

    ImmutableList<Manifest> manifestList = //Read manifest file and obtain List<Manifest>
    File zipFile = new File(config.getInputDir() + File.separator + zipName);
    //Extracting PDF into `process` folder
    ZipUtil.extractAll(config.getExtractPdfDir(), zipFile);
    outputPdfName = zipName.replace(".zip", ".pdf");
    outputZipStream = new FileOutputStream(config.getOutputDir() +
            File.separator + outputPdfName);
    document = new Document(PageSize.LETTER, 0, 0, 0, 0);
    writer = new PdfCopy(document , outputZipStream);
    document.open();    //Open the document
    //Start combining PDF files together    
    for(Manifest m : manifestList){
        //Obtain full path to the current pdf
        String pdfFilePath = config.getExtractPdfDir() + File.separator + m.getPdfName();
        //Before combining PDF, add backer and affidavit to individual PDF
        PdfReader pdfReader = PdfUtil.addBackerAndAffidavit(config, pdfType, m,
                pdfFilePath, blankPdf, mdxBacker, theaBacker, mdxAffidavit,
                theaAffidavit);
        for(int pageNumber=1; pageNumber<=pdfReader.getNumberOfPages(); pageNumber++){
            document.newPage();
            PdfImportedPage page = writer.getImportedPage(pdfReader, pageNumber);
            writer.addPage(page);
        }
    }
} catch (DocumentException e) {

} catch (IOException e) {

} finally{
    if(document != null) document.close();
    try{
        if(outputZipStream != null) outputZipStream.close();
        if(writer != null) writer.close();
    }catch(IOException e){

    }
}

Please rest assured that I have looked at this code for a long time and tried rewriting it many times to reduce the amount of memory it uses. When the OutOfMemoryError occurs, there are still many PDF files that have not had the 2-3 extra pages added, so I think the problem is inside addBackerAndAffidavit. However, I close every resource I open, and it still throws the exception. Please help.

Thang Pham
  • Have you tried to use a memory profiler while it is running? – Shane Wealti Sep 26 '11 at 19:30
  • I searched on "pdfwriter+flush" here on SO and found this: http://stackoverflow.com/questions/1260895/merging-1000-pdf-thru-itext-throws-java-lang-outofmemoryerror-java-heap-space You may find its accepted answer helpful. – BalusC Sep 26 '11 at 19:34
  • @ShaneWealti: I did, but for some reason the Eclipse Helios profiler always makes Eclipse unresponsive after running for a while. It probably runs out of memory. – Thang Pham Sep 26 '11 at 19:35
  • @BalusC: Actually, `PdfCopy#freeReader(PdfReader)` does the trick. Thank you very much for the link, BalusC. – Thang Pham Sep 26 '11 at 20:01
  • You're welcome. I'm only uncertain if this question should be closed as dupe (technically, it is), or you just delete this question, or just answer the question yourself, or I should repost it as an answer as well (although I have almost no practical iText experience). What do you want, Harry? – BalusC Sep 26 '11 at 20:05
  • @BalusC: Please repost the answer, BalusC. I will accept it. – Thang Pham Sep 26 '11 at 20:09

1 Answer

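You need to invoke PdfWriter#freeReader() at the end of every loop iteration to free the PdfReader involved. PdfCopy#freeReader() overrides this method from PdfWriter and does the same. In the question's code, that means calling writer.freeReader(pdfReader) after the inner page loop. See also the javadoc: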

freeReader

public void freeReader(PdfReader reader)
                throws IOException

Description copied from class: PdfWriter
Use this method to writes the reader to the document and free the memory used by it. The main use is when concatenating multiple documents to keep the memory usage restricted to the current appending document.

Overrides:
freeReader in class PdfWriter

Parameters:
reader - the PdfReader to free

Throws:
IOException - on error
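
To see why this helps, here is a toy model in plain Java. It is not iText code: `ToyReader` and `ToyCopy` are hypothetical stand-ins, and the sketch assumes (as the javadoc wording suggests) that the copying writer keeps a reference to every reader it has imported pages from until that reader is freed. It only counts how many readers are held at the same time.

```java
import java.util.IdentityHashMap;
import java.util.Map;

/** Toy model only: these classes are hypothetical stand-ins, not iText types. */
public class FreeReaderSketch {

    /** Stand-in for a PdfReader: holds the parsed bytes of one input file. */
    static final class ToyReader {
        final byte[] parsed;

        ToyReader(int sizeBytes) {
            this.parsed = new byte[sizeBytes];
        }
    }

    /** Stand-in for a copying writer that remembers every reader it imported from. */
    static final class ToyCopy {
        private final Map<ToyReader, Boolean> retained = new IdentityHashMap<>();
        private int peakRetained = 0;

        /** Importing pages makes the writer hold a reference to the reader. */
        void importPages(ToyReader reader) {
            retained.put(reader, Boolean.TRUE);
            peakRetained = Math.max(peakRetained, retained.size());
        }

        /** Analogue of freeReader: drop the reference so the reader can be collected. */
        void freeReader(ToyReader reader) {
            retained.remove(reader);
        }

        int peakRetained() {
            return peakRetained;
        }
    }

    /** Merges `files` toy inputs and reports how many readers were held at once. */
    public static int peakReadersHeld(int files, boolean freeEachIteration) {
        ToyCopy copy = new ToyCopy();
        for (int i = 0; i < files; i++) {
            ToyReader reader = new ToyReader(1024);
            copy.importPages(reader);
            if (freeEachIteration) {
                copy.freeReader(reader); // release at the end of every iteration
            }
        }
        return copy.peakRetained();
    }

    public static void main(String[] args) {
        System.out.println("without freeReader: " + peakReadersHeld(5000, false));
        System.out.println("with freeReader:    " + peakReadersHeld(5000, true));
    }
}
```

With 5000 inputs, the first call reports 5000 readers held at once and the second reports 1, which is the effect the javadoc describes as keeping memory usage restricted to the current appending document. In the question's loop the corresponding call would be writer.freeReader(pdfReader) after the inner page loop.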

BalusC