I have the following function which I am using to loop through a directory of PDF files, and create a thumbnail for each:

<cffunction name="createThumbnails" returntype="Void" output="false">
    <cfscript>
        //  CONSTANTs
        var _PDF_PATH = APPLICATION.PDFSource & "\PDFs";

        //  Set defaults for private variables
        var _qPDFDir = QueryNew("");
        var _documentId = 0;
        var _sourceFilePath = "";
        var _sku = "";
        var _tempImageFilePath = "";
    </cfscript>

    <!--- Retrieve a list of file names in the directory of unprocessed PDF files --->
    <cfdirectory
        action="list"
        directory="#_PDF_PATH#"
        name="_qPDFDir"
        filter="*.pdf"
        type="file"
        sort="datelastmodified DESC"
        listinfo="name" />

    <!--- Loop through the list of file names in the directory of unprocessed PDF files --->
    <cfloop query="_qPDFDir" endrow="500">
        <cfset _sourceFilePath = _PDF_PATH & "\" & name />

        <cfif FileExists(_sourceFilePath) AND IsPDFFile(_sourceFilePath)>
            <cftry>
                <cfpdf
                    action="thumbnail"
                    source="#_sourceFilePath#"
                    destination="#APPLICATION.TempDir#"
                    format="png"
                    scale="100"
                    resolution="high"
                    overwrite="true"
                    pages="1" />

                <cfcatch>
                    <cfscript>
                        FileMove(
                            _sourceFilePath,
                            _PDF_PATH & "\NonFunctioning\"
                            );
                    </cfscript>
               </cfcatch>
            </cftry>

            <cfscript>
                _documentId =
                    REQUEST.UDFLib.File.getFileNameWithoutExtension(name);
                _tempImageFilePath =
                        APPLICATION.TempDir
                    &   "\"
                    &   _documentId
                    &   "_page_1.png";

                if  (FileExists(_tempImageFilePath)) {
                    _sku = getSkuFromDocumentId(_documentId);

                    if  (Len(_sku)) {
                        CreateObject(
                            "component",
                            "cfc.products.Product"
                            ).setClientId(
                                getClientId()
                                ).setId(
                                    _sku
                                    ).createThumbnails(
                                        sourcePath = _tempImageFilePath,
                                        deleteSourceFile = true
                                        );

                        FileMove(
                            _sourceFilePath,
                            APPLICATION.ProcessedPDFDir
                            );
                    }
                }
            </cfscript>
        </cfif>
    </cfloop>

    <cfreturn />
</cffunction>

Some of this code is not how I would choose to do things (for example, moving files to the "NonFunctioning" directory), but it is required by the business rules.

What I'm trying to figure out is: how can I avoid using up memory when this function runs?

With endrow="500", it bombed with a java.lang.OutOfMemoryError: GC overhead limit exceeded error after processing about 148 files.

I can also see memory usage climb steadily when I watch the jrun.exe process in Task Manager (Windows).

Is there any way I can improve the performance of this function to prevent memory from being eaten up?

Eric Belair
  • It looks like your VM is spending too much time collecting garbage: http://stackoverflow.com/questions/4371505/gc-overhead-limit-exceeded . The "GC overhead limit exceeded" message is telling you it's spending too long doing full GC, not necessarily running out of memory completely. Do you have GC logging configured on your CF instance? I'd set that to run every second and examine the log. Use GCViewer to visualise the logs: https://github.com/chewiebug/GCViewer – barnyr Jun 11 '13 at 16:08
  • Is this function calling itself with this line? ).createThumbnails( If so, you might have an infinite loop. – Dan Bracuk Jun 11 '13 at 16:19
  • Could you do the cfdirectory outside of the function and pass the directory/filename in to it? Then, in your main script, have your cfdirectory run 50 files at a time and redirect back to itself via a cflocation or JavaScript refresh until it's done processing everything in the folder. It seems that you move your PDFs to a processed folder once they are done? So if you have 500 files in a folder, running 50 at a time, 10 passes should do it. Add in a cfabort when your cfdirectory stops returning results. – steve Jun 11 '13 at 16:22
  • @DanBracuk no, it's calling a function in a different Object. – Eric Belair Jun 11 '13 at 16:35

1 Answer

You are creating your object inside the loop. Each time you create the object, CF holds on to its memory until your page has finished running, and only then releases it.

  1. Instantiate your object outside of the loop, save it in a variable, and reuse that object inside the loop.
  2. Do not process all of the files in a single pass. As steve suggested, process them in blocks, then create a new request (e.g. with cfhttp) that calls your processing page again; this releases any memory used in the first pass.
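
A rough, untested sketch of both points, reusing the names from the question. The function name createThumbnailBatch, the batchSize argument, and the _product and _moved variables are made up for illustration. It assumes that setClientId() and setId() return the component (the chaining in the question suggests they do) and that a Product instance carries no leftover state once setId() is called with a new SKU.

```cfm
<cffunction name="createThumbnailBatch" returntype="numeric" output="false">
    <!--- batchSize is a hypothetical argument; 50 follows steve's comment --->
    <cfargument name="batchSize" type="numeric" required="false" default="50" />

    <cfscript>
        var _PDF_PATH = APPLICATION.PDFSource & "\PDFs";
        var _qPDFDir = QueryNew("");
        var _documentId = 0;
        var _sourceFilePath = "";
        var _sku = "";
        var _tempImageFilePath = "";
        var _moved = 0;
        // Point 1: create the component once, before the loop, and reuse it
        var _product = CreateObject("component", "cfc.products.Product").setClientId(getClientId());
    </cfscript>

    <cfdirectory action="list" directory="#_PDF_PATH#" name="_qPDFDir"
        filter="*.pdf" type="file" sort="datelastmodified DESC" listinfo="name" />

    <!--- Point 2: only handle one small block of files per request --->
    <cfloop query="_qPDFDir" endrow="#ARGUMENTS.batchSize#">
        <cfset _sourceFilePath = _PDF_PATH & "\" & name />

        <cfif FileExists(_sourceFilePath) AND IsPDFFile(_sourceFilePath)>
            <cftry>
                <cfpdf action="thumbnail" source="#_sourceFilePath#"
                    destination="#APPLICATION.TempDir#" format="png" scale="100"
                    resolution="high" overwrite="true" pages="1" />

                <cfcatch>
                    <cfset FileMove(_sourceFilePath, _PDF_PATH & "\NonFunctioning\") />
                    <cfset _moved = _moved + 1 />
                </cfcatch>
            </cftry>

            <cfscript>
                _documentId = REQUEST.UDFLib.File.getFileNameWithoutExtension(name);
                _tempImageFilePath = APPLICATION.TempDir & "\" & _documentId & "_page_1.png";

                if (FileExists(_tempImageFilePath)) {
                    _sku = getSkuFromDocumentId(_documentId);

                    if (Len(_sku)) {
                        // Reuse the single instance instead of calling CreateObject() per file
                        _product.setId(_sku).createThumbnails(
                            sourcePath = _tempImageFilePath,
                            deleteSourceFile = true
                        );
                        FileMove(_sourceFilePath, APPLICATION.ProcessedPDFDir);
                        _moved = _moved + 1;
                    }
                }
            </cfscript>
        </cfif>
    </cfloop>

    <!--- Number of files moved out of the source directory in this pass --->
    <cfreturn _moved />
</cffunction>
```

The calling page would run one batch per request and then trigger a fresh request (cfhttp, cflocation, or a scheduled task) while files remain. One caveat: files with no SKU are never moved, so they stay at the top of the listing on every pass. The caller needs a stop condition that accounts for that, for example stopping when a pass returns 0.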
garyrgilbert