5

I'm working on the MIU system problem from "Gödel, Escher, Bach" chapter 2.

One of the rules states

Rule III: If III occurs in one of the strings in your collection, you may make a new string with U in place of III.

Which means that the string MIII can become MU, but for other, longer strings there may be multiple possibilities [matches in brackets]:

  • MIIII could yield
    • M[III]I >> MUI
    • MI[III] >> MIU
  • MUIIIUIIIU could yield
    • MU[III]UIIIU >> MUUUIIIU
    • MUIIIU[III]U >> MUIIIUUU
  • MUIIIIU could yield
    • MU[III]IU >> MUUIU
    • MUI[III]U >> MUIUU

Clearly regular expressions such as /(.*)III(.*)/ are helpful, but I can't seem to get them to generate every possible match, just the first one it happens to find.

Is there a way to generate every possible match?

(Note, I can think of ways to do this entirely manually, but I am hoping there is a better way using the built in tools, regex or otherwise)

(Edited to clarify overlapping needs.)

Colin Dabritz
  • 861
  • 8
  • 19
  • Are you asking how to get _all_ the occurrences of III anywhere in a string, including overlapping matches, and having them returned in an array? – Ray Toal Aug 03 '13 at 05:13
  • Yes, as per the examples. e.g. MIIII should have two matches, for M[III]I and MI[III]. (These go on to become MUI and MIU). Exec looked promising, but it won't handle overlapping cases. – Colin Dabritz Aug 03 '13 at 05:21
  • `RegExp.exec` _can_ work for you, but you'd need to play with match objects, as in Kolink's answer. In this case, `indexOf` is probably better, but regexes are certainly more general. – Ray Toal Aug 03 '13 at 05:37
  • Hofstadter spent some time in Oregon too, didn't he? – Andrew Cheong Aug 03 '13 at 05:38

2 Answers2

12

Here's the regex you need: /III/g - simple enough, right? Now here's how you use it:

var text = "MUIIIUIIIU", find = "III", replace "U",
    regex = new RegExp(find,"g"), matches = [], match;
while(match = regex.exec(text)) {
    matches.push(match);
    regex.lastIndex = match.index+1;
}

That regex.lastIndex... line overrides the usual regex behaviour of not matching results that overap. Also I'm using a RegExp constructor to make this more flexible. You could even build it into a function this way.

Now you have an array of match objects, you can do this:

matches.forEach(function(m) { // older browsers need a shim or old-fashioned for loop
    console.log(text.substr(0,m.index)+replace+text.substr(m.index+find.length));
});

EDIT: Here is a JSFiddle demonstrating the above code.

Niet the Dark Absol
  • 320,036
  • 81
  • 464
  • 592
  • Hrm, why +1, not -1? This doesn't appear to solve my problem, I'm still poking at it. I'll report back with what I find. – Colin Dabritz Aug 03 '13 at 05:27
  • I'm not sure why you'd think `-1` would work. You need to move *forward* in the string, not backward. Moving backward would repeatedly find the exact same match and give you an infinite loop. Edited a JSFiddle link into the answer. – Niet the Dark Absol Aug 03 '13 at 05:28
  • Missread the above, I was editing the last index directly, which turns out to be much less useful. This appears to work, thanks much! – Colin Dabritz Aug 03 '13 at 05:31
  • Nice, did some googling on how to do regexp searches in JavaScript accounting for overlapping matches. Probably looked for 30min until I found this post with the simple `regex.lastIndex = match.index + 1` solution. Thanks! – BobuSumisu Sep 04 '13 at 03:44
2

Sometimes regexes are overkill. In your case a simple indexOf might be fine too!

Here is, admittedly, a hack, but you can transform it into pretty, reusable code on your own:

var s = "MIIIIIUIUIIIUUIIUIIIIIU";
var results = [];
for (var i = 0; true; i += 1) {
    i = s.indexOf("III", i);
    if (i === -1) {
        break;
    }
    results.push(i);
}
console.log("Match positions: " + JSON.stringify(results));

It takes care of overlaps just fine, and at least to me, the indexOf just looks simpler.

Ray Toal
  • 86,166
  • 18
  • 182
  • 232
  • Yes, this case is fairly simple. They get a bit more complex, and ideally it would be essentially open ended. – Colin Dabritz Aug 03 '13 at 05:37
  • Agreed. You will want something more powerful for the Mx → Mxx rule :) – Ray Toal Aug 03 '13 at 05:38
  • 1
    Yes, fortunately the regexes handle the Mx -> Mxx case nicely: `newTheorems.push( startingTheorem.replace( /(M)(.+)/i, "$1$2$2" ) );` – Colin Dabritz Aug 03 '13 at 05:41
  • Aaaaccctually, it's even easier than that: `function Rule2(s) {return s + s.substring(1);}` because in the MIU-puzzle, _all_ strings start with `M`! – Ray Toal Aug 03 '13 at 05:47
  • True, nice use of built in tools. This is helping me learn javascript regular expressions though. I already know C ;) – Colin Dabritz Aug 03 '13 at 05:57