
I need to loop over a list of words and match each one against a regex, but I'm getting too many matches (for example, looking for ALPHA also matches ALPHABET). Here is my script:

function evaluateChunk(my_chunk, my_input) {
  var rows = my_chunk.split(LINE_BREAKS).filter(Boolean),
    word_string = my_input.replace(/ /g, "|"),
    re = new RegExp("(" + word_string + ")(?:\\([0-9]\\))?"),
    output_dict = {"error_list": [], "match_dict": {}},
    row_len = rows.length,
    candidate,
    j;

  for (j = 0; j < row_len; j += 1) {
    candidate = rows[j].split(" ")[0];
    if (candidate.match(re) !== null) {
      output_dict.match_dict[candidate] = rows[j].split("]").pop().trim();
    }
  }
  output_dict.error_list = word_string.split("|").reduce(function (arr, word) {
    if (output_dict.match_dict[word] === undefined) {
      arr.push(word);
    }
    return arr;
  }, []);

  return output_dict;
}

my_input will be something like ALPHA BETA, which is converted to ALPHA|BETA and inserted into the regular expression (which needs to catch both ALPHA and ALPHA(2), hence the optional group at the end). rows is a chunk of the dictionary I'm looking the inputs up in.

My issue is that the regular expression should behave like this:

ALPHA    => match (ok)
ALPHA(2) => match (ok)
ALPHABET => no match (doesn't work, is also returned)
ALPHANUMERIC => no match (doesn't work, is also returned)
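
A minimal reproduction of the behaviour listed above, assuming the input ALPHA BETA and the same pattern construction as in evaluateChunk. The variable names unanchored, candidates and unanchoredResults are made up for this sketch. Because the pattern is not tied to the start or end of the candidate, finding ALPHA anywhere inside the string is enough for match to succeed:

```javascript
// Build the pattern exactly as evaluateChunk does, for the sample input "ALPHA BETA".
var word_string = "ALPHA BETA".replace(/ /g, "|");
var unanchored = new RegExp("(" + word_string + ")(?:\\([0-9]\\))?");

// The four candidates from the table above.
var candidates = ["ALPHA", "ALPHA(2)", "ALPHABET", "ALPHANUMERIC"];

// match() only needs the pattern to occur somewhere in the string,
// so the leading "ALPHA" of "ALPHABET" already counts as a hit.
var unanchoredResults = candidates.map(function (candidate) {
  return candidate.match(unanchored) !== null;
});

console.log(unanchoredResults); // every entry is true, including the two unwanted ones
```
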

Question:
How do I restrict the regular expression so that it only returns exact word matches, and not longer words that share the same root?

frequent
  • Try `re = new RegExp("\\b(" + word_string + ")\\b(?:\\([0-9]\\))?")`. If you need to match the whole string, try `re = new RegExp("^(" + word_string + ")(?:\\([0-9]\\))?$")` – Wiktor Stribiżew Mar 14 '17 at 08:54
  • Cool, that worked. Want to make it an answer so I can accept it? Thank you very much. – frequent Mar 14 '17 at 09:01
  • Word boundaries worked? Or anchors? Anyway, the question is a dupe. – Wiktor Stribiżew Mar 14 '17 at 09:05
  • Ah, I did not see your edit. Your first solution with the word boundaries worked. If it's a dupe, please vote to close, but I've stumbled over the same issue a few times and never managed to find a good answer on SO. – frequent Mar 14 '17 at 09:07
  • Or [whole word match in javascript](http://stackoverflow.com/questions/2232934/whole-word-match-in-javascript). As I say, it is still a duplicate. There are a lot of questions on how to match a whole word/string on SO. Search for "word boundary regex" on Google. – Wiktor Stribiżew Mar 14 '17 at 09:07
  • I see. Thanks for chipping in. – frequent Mar 14 '17 at 09:08
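
For reference, here is a rough sketch of the two patterns suggested in the comments, run against the candidates from the question. The helper names buildBoundaryRegex and buildAnchoredRegex are hypothetical, and the sketch assumes the words contain only letters (nothing that would need escaping inside a regex):

```javascript
// Hypothetical helper: word-boundary variant from the first suggestion.
function buildBoundaryRegex(word_string) {
  return new RegExp("\\b(" + word_string + ")\\b(?:\\([0-9]\\))?");
}

// Hypothetical helper: anchored variant, which must match the whole candidate.
function buildAnchoredRegex(word_string) {
  return new RegExp("^(" + word_string + ")(?:\\([0-9]\\))?$");
}

var alternation = "ALPHA BETA".replace(/ /g, "|"); // "ALPHA|BETA"
var boundary = buildBoundaryRegex(alternation);
var anchored = buildAnchoredRegex(alternation);

var samples = ["ALPHA", "ALPHA(2)", "ALPHABET", "ALPHANUMERIC"];

var boundaryResults = samples.map(function (s) { return boundary.test(s); });
var anchoredResults = samples.map(function (s) { return anchored.test(s); });

console.log(boundaryResults); // [ true, true, false, false ]
console.log(anchoredResults); // [ true, true, false, false ]
```

Both variants agree on these four candidates. They can differ on other input: the word-boundary pattern still accepts a candidate such as ALPHA-X, because the hyphen counts as a boundary, while the anchored pattern rejects anything beyond the word and its optional (digit) suffix. Since candidate is already a single token taken from the start of each row, the anchored form may be the stricter fit.
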
