I am making an interface for searching a fairly limited dictionary of words (2700 or so entries). The words are stored in an XML file thus:

<root>
    <w>aunt</w>
    <w>active volcano</w>
    <w>Xamoschi</w>
</root>

It is fairly basic - the user enters a string, and any matches are spit back out. The problem came when I wanted to include a wildcard character. If a user enters a string with asterisks, each asterisk is replaced by a regex to match zero or more characters, which can be anything.

So, when the user hits search, the script cycles through the XML tags and matches each nodeValue against the pattern srch:

var wildcardified = userinput.replace(/\*/g, ".*?");
var srch = new RegExp(wildcardified, "gi");

//for loop cycles through the xml, and tests with this:
if (srch.test(tag[i].firstChild.nodeValue)) {
    //it's a match!
}

For the most part, it works as I'd hoped. But I'm getting some inconsistent results that I can't explain. For the values in the XML tags above, this is what happens with various inputs:

  1. a* matches all three
  2. a*n matches aunt and active volcano
  3. a*t only matches aunt
  4. a*ti only matches active volcano

Should #3 not also match the act in active volcano?

I see the same kind of results with other similar sets of words. I've tried to isolate the specific issue, but I can't for the life of me figure out what it is.

The Question: Can someone explain why #3 is not returning "active volcano", and what I can do to fix such behaviour?

Incidentally, I want it to be non-greedy, but just in case that was the issue, I tested both with and without the ?. Both returned the same inconsistent results above.

David John Welsh

1 Answer

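It's the g modifier in new RegExp(wildcardified, "gi") that is causing you trouble. With g set, test() records where the last successful match ended in the regex's lastIndex property and starts the next search from that position, even when the next call is on a different string. For an explanation and a workaround, see Why does the "g" modifier give different results when test() is called twice?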
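To make that concrete, here is a minimal sketch that reproduces the behaviour with the three sample words and then avoids it by dropping the g flag. The helper names wildcardToRegExp and search are hypothetical, not from the question, and escaping the other regex metacharacters in the user's input is an extra precaution the original code did not take.

```javascript
// Reproduce the problem: a regex with the "g" flag keeps state in lastIndex.
const sticky = new RegExp("a.*?t", "gi");
const first = sticky.test("aunt");             // true: matches "aunt"
const indexAfterFirst = sticky.lastIndex;      // 4: where that match ended
const second = sticky.test("active volcano");  // false: search began at index 4

// Hypothetical helper: build the pattern WITHOUT "g", so test() is stateless.
// Each literal chunk is escaped so only "*" acts as a wildcard (an assumption
// about the desired behaviour, not something the question specifies).
function wildcardToRegExp(userInput) {
  const pattern = userInput
    .split("*")
    .map((part) => part.replace(/[.*+?^${}()|[\]\\]/g, "\\$&"))
    .join(".*?");
  return new RegExp(pattern, "i");
}

// Hypothetical helper: return every word the wildcard pattern matches.
function search(words, userInput) {
  const srch = wildcardToRegExp(userInput);
  return words.filter((word) => srch.test(word));
}

const words = ["aunt", "active volcano", "Xamoschi"];
const matchesForAT = search(words, "a*t");   // ["aunt", "active volcano"]
const matchesForATI = search(words, "a*ti"); // ["active volcano"]
```

If the g flag has to stay for some other reason, resetting srch.lastIndex to 0 before each test() call should have the same effect.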

Mike Samuel
  • Fantastic!!! Setting `srch.lastIndex = 0` after the `if` statement solved the problem instantly. – David John Welsh Jan 20 '12 at 15:30
  • @DavidJohnWelsh, yeah. That property of `test` is probably a spec error, but it's not going away anytime soon. – Mike Samuel Jan 20 '12 at 15:32
  • Either way it's interesting to know. The fact that it's weird makes it memorable, so hopefully it won't trip me up again. Anyway, there is no way I would ever have figured out what the problem was on my own, so thank you kindly! Saved me a headache and a half. – David John Welsh Jan 20 '12 at 15:54