4

I've been seeing intermittent errors in a couple of systems I've been working on, when using the same methodology (not the same code) leading me to believe the problem may be linked to creating and using structs in the same request. I'm wondering if it's possible there's a race condition?

The scenario is this: We're on an e-commerce system, looking at a product, or in some cases a list of products. The code in question is designed to return the images associated with each product, in a struct that we can use for display of said images.

At the beginning of the request, the code looks for database records associated with the item in question. These records represent images for the product(s). These records are returned in a single CFQuery call (or more accurately, a call to a function which returns the results of a CFQuery call, shaped into a struct containing various info).

The code then loops through the supplied image struct, and adds various information to a Local struct. Later in the request we use the data in the struct to display the images in our <img> tags. We also populate the <img> tag with data- attributes for use with JavaScript.

In the case that any particular image was not correctly returned by the query - usually because the physical file is missing - we use a generic placeholder image. This is done by putting the struct creation inside a try/catch block.

Importantly: this works.

What's happening however, is that very intermittently, when referring to a node in the struct we've created, we find that it does not exist and CF throws an error - this happens maybe 1% of the time and reloading the same page, everything will work perfectly.

I've had this same problem on multiple systems, on multiple servers, on different versions of ColdFusion (8 & 10 to be specific) and using completely different code to achieve similar results. The first system I saw this issue on, actually used FileExists to check that the image file was available and thus I thought that the problem was probably caused by the bottleneck of the filesystem - I tried many ways around this and eventually eliminated it altogether in the new system - but the problem persists.

The only thing I can think of, is that when creating a struct and then using that struct later in the same request, there's a possibility that a race condition occurs; whereby I refer to a node in the struct before it's finished being created. I'm not using threading here though, so I can't really see how that's possible... I'm out of other ideas.

Some code is below to show what I'm doing, but given that the same issue arises on completely different systems, I think it's the methodology rather than the code that has a problem.

<!--- Get product images --->
<cfset Local.stProductImages = Application.cfcParts.getPartImages(
        l_iItemID = Arguments.pid
) />


<!--- Loop through images --->
<cfloop list="#ListSort(structKeyList(Local.stProductImages['item_' & Arguments.pid]), 'text')#" index="i">
    <cftry>
        <cfset Local['ImageURL_' & i & '_Large']    = Local.stProductImages['item_' & Local.arguments.pid][i].large_watermarked.URL />
        <cfcatch>
            <cfset Local['ImageURL_' & i & '_Large']    = Application.com.Images.getMissingImages().large />
        </cfcatch>
    </cftry>                        
    <cftry>
        <cfset Local['ImageURL_' & i & '_Med']      = Local.stProductImages['item_' & Local.arguments.pid][i].med.URL />
        <cfcatch>
            <cfset Local['ImageURL_' & i & '_Med']      = Application.com.Images.getMissingImages().med />
        </cfcatch>
    </cftry>                        
    <cftry>
        <cfset Local['ImageURL_' & i & '_Small']        = Local.stProductImages['item_' & Local.arguments.pid][i].small.URL />
        <cfcatch>
            <cfset Local['ImageURL_' & i & '_Small']        = Application.com.Images.getMissingImages().small />
        </cfcatch>
    </cftry>                        

    <img class          = "altProdImg<cfif i EQ 'image_03'> endImage</cfif>" 
        src             = "#Local['ImageURL_' & i & '_Small']#" 
        image           = "#i#" 
        alt             = ""
        data-imgsmall   = "#Local['ImageURL_' & i & '_Small']#"
        data-imgmed     = "#Local['ImageURL_' & i & '_Med']#"
        data-imglarge   = "#Local['ImageURL_' & i & '_Large']#"
        data-imgnum     = "#i#"
        data-pid        = "#Arguments.pid#"
    />
</cfloop>

The error occurs in the <img> tag, when referring to a node created in the preceding code - Something like:

Element ImageURL_image_02_Large is undefined in a Java object of type class coldfusion.runtime.LocalScope.

But only very occasionally... I'll reload and it'll work perfectly every time.

So... sorry for the epic length of question, but can anybody see how this could occur?

Gary Stanton
  • 1,435
  • 12
  • 28
  • 1
    http://varscoper.riaforge.org/ – Peter Boughton Nov 08 '13 at 13:07
  • Not sure I'm with you Peter - I thought var scoping was for functions? The functions that are being called (Application.cfcParts.getPartImages and Application.com.Images.getMissingImages) are both varscoped correctly - is there something I should be doing inside the .cfm as well?? – Gary Stanton Nov 08 '13 at 13:11
  • Also, just ran VarScoper on the file in question and it came back with nothing! Can't be certain I'm using it correctly though, have never used it before. – Gary Stanton Nov 08 '13 at 13:18
  • 1
    Are you sure the code is not being called from in a function? The behaviour you describe is symptomatic of not var scoping, and the error specifically points to Local scope (which only exists in functions). Also, you're referring to `Arguments.pid`? – Peter Boughton Nov 08 '13 at 13:29
  • you're using the `local` scope, which implies this code is within a function, as that's all the local scope is intended for – duncan Nov 08 '13 at 13:29
  • (VarScoper isn't perfect, and can't detect unscoped vars in included files - i.e. `` - which is something some frameworks do.) – Peter Boughton Nov 08 '13 at 13:30
  • yeah let's see the code in getMissingImages and getPartImages. Also you call getMissingImages () three times. Couldn't you call it once, store the results in a single struct and then refer to that struct's small/med/large as required? – duncan Nov 08 '13 at 13:32
  • 2
    It might be as simple to fix as using `index="local.i"` in the cfloop tag (you only need the scoping when writing the variable). – Peter Boughton Nov 08 '13 at 13:33
  • It's definitely not inside a function, no... There is an 'Arguments' scoped variable - unfortunately this is a legacy system and I'm just working with the variables I'm given! As for the Local scope, I've always used that inside .cfm files as well as functions. It's just a handy way of grouping the variables I create. Perhaps I should be using 'Variables' instead, but it's never caused a problem before... If there's a reason not to, I'd love to know more. – Gary Stanton Nov 08 '13 at 14:29
  • `getMissingImages` simply returns a private struct in the component. When it's instantiated, I pass the locations of the missing image files as a struct which is stored in the component's private Variables scope. Literally all the function contains is the following: `` Of course, I could just call this the once and store the result, but given there's no processing involved whatsoever, I didn't think there'd be an overhead. `getPartImages` returns the result of another function that deals with images. It's all properly scoped, I'm sure. I'll post if required. – Gary Stanton Nov 08 '13 at 14:31
  • @PeterBoughton - I'll give the local scope in the loop a go - may take a while to report back though since the problem is very intermittent. – Gary Stanton Nov 08 '13 at 14:33
  • Yeah, since CF8 the `local` scope is the name of where using the `var` keyword puts variables (actually, it's not quite that simple, but close enough). So I'd probably say avoid using just `local` as a normal variable to avoid confusion. (And I'd fix that `Arguments` one too.) – Peter Boughton Nov 08 '13 at 14:47
  • A relative easy way to check if you're in a function, without going through code: `` then check the stack trace - if you see stuff like `coldfusion.runtime.UDFMethod` or `$funcSTUFF.runFunction(filename:line)` then you know you're inside one. – Peter Boughton Nov 08 '13 at 14:47
  • Oh, and for a possible way to force the error, try a load testing tool like [JMeter](https://jmeter.apache.org/) - might help to make it occur more (and thus help verify when it's fixed/reduced). – Peter Boughton Nov 08 '13 at 14:49
  • Well, when writing functions, I generally tend to use `` at the top and store everything inside the Local scope after that... I'm pretty sure that was a *BenNadelism* and thus, utterly infallible. ;) Is this no longer good practice? Alas, I can't fix the Arguments variable, there's a bloody huge request path in this system and I'd have to change it in so many files it doesn't bare thinking about. I shouldn't be in a function, but then, this is a truly horrid system running through Fusebox, so who knows? I'll check. Cheers. – Gary Stanton Nov 08 '13 at 14:57
  • Aha... I *do* appear to be in a function... the .cfm I'm working on is a template that is included by a function (that's run by a template included in a function in a function in a function in a function in a... I hate this system). So, I guess we're onto something with this. I'll bet one of those functions hasn't varscoped local... – Gary Stanton Nov 08 '13 at 15:02
  • Inside a function, if you need to support old engines (CF7/OBD/etc) then `` is ok (there are some minor issues). If you're running on modern engines, it shouldn't be necessary. I was referring to not using it outside functions though. – Peter Boughton Nov 08 '13 at 15:02
  • If most of the functions start with the `var local` thing, this regex _might_ help find the cuplrit: `]+>(?:\s*]+>)*(?!\s* – Peter Boughton Nov 08 '13 at 15:06
  • My regex skills are non-existent, so I'm not certain what that's trying to do, but it returns hundreds of functions including those which are varscoped. Thanks anyway! The system was written for CF8 and is now on CF10, but I always figured it couldn't hurt. I've always used my own 'Local' scope outside functions though. Just seemed like the right place to put things. I don't relish the idea of changing that in the hundreds of thousands of files in which it's used!!! Is it really a problem? Any links I can follow for explanations of that? – Gary Stanton Nov 08 '13 at 15:14
  • Nah, not a big enough problem to go change everything, just better not to use built-in scope names for unscoped/root-level variables, since it's less confusing and avoids potential conflicts with the real scope (e.g. in a function with `` you can't get to the `url` scope.) So maybe use something else going forward, but probably no need to worry about going back and changing older stuff. – Peter Boughton Nov 08 '13 at 15:20
  • To be clear, I'm not talking about using 'Local' as a variable name... I'd never do that!! I just put everything in the `Local` scope, or sometimes the `Request` scope if it seems more applicable... Do other people just scope using `Variables` then? – Gary Stanton Nov 08 '13 at 15:24
  • When you say... _" I've always used my own 'Local' scope outside functions though."_ - there is no local scope outside functions, so that's using local as a variable name for a struct. Some people use `loc` as a short name that's different to `local` but still relatively obvious (of course, some people hate abbreviations in code, so will disagree with that). – Peter Boughton Nov 08 '13 at 15:27
  • Ahh, yes I see what you're saying... So 'my own Local scope' is actually an unscoped variable called 'Local'. Makes sense when you spell it out. Bugger. Well, always learning, ay? Ten bloody years I've been doing that! – Gary Stanton Nov 08 '13 at 15:30
  • So, I've updated the loop to use `Local.i` and after a few hours I've had no errors... which isn't to say the issue's resolved but it's promising. Thing is, as previously discussed, I'm not actually scoping the `i` by putting it in the Local scope, as the Local scope isn't a scope at all! So, why would this work? – Gary Stanton Nov 08 '13 at 17:29
  • Well, after a few days of no errors, I'm convinced the scoping issue was indeed the culprit - Adding 'Local' to the loop index seems to have fixed the issue. @PeterBoughton - if you want to submit that as an answer, I'll mark it accepted, for the virtual kudos... and thanks for the help! ;) – Gary Stanton Nov 11 '13 at 10:11

1 Answers1

3

Answer from comments...

The behaviour you describe is symptomatic of not var scoping, so it might be as simple to fix as using index="local.i" in the cfloop tag (you only need the scoping when writing the variable).


Side note: A relatively easy way to check if you're in a function, without going through code, is by throwing an error (i.e. <cfthrow message="where am i?" />) then check the stack trace - if you see stuff like coldfusion.runtime.UDFMethod or $funcSTUFF.runFunction(filename:line) you know you're inside a function (even when the template you're in shows no sign of it).

Peter Boughton
  • 110,170
  • 32
  • 120
  • 176