With a string like "HorsieDoggieBirdie", is there a non-capturing regex replace that would kill "Horsie" and "Birdie", yet keep "Doggie" intact? I can only think of a capturing solution:

s/(Horsie)(Doggie)(Birdie)/$2/g

Is there a non-capturing solution like:

s/Horsie##Doggie##Birdie//g

where ## is some combination of regex codes? The specific problem is in JavaScript (innerHTML.replace) but I'll take Perl suggestions, too.

Marcel Korpel
MrSparkly
  • Why are you trying to avoid captures? – Mark Thomas Nov 09 '10 at 12:21
  • Well, the obvious answer to your question as you ask it now is `s/HorsieDoggieBirdie/Doggie/g`, but I assume that's not what you *really* wanted to ask? – JanC Nov 09 '10 at 12:24
  • The Horsies and Birdies are a simplified example. The real string is obnoxiously long and needs three complex regexes to identify the first, middle and end parts. The string is big, so I'm trying to avoid captures, which I associate (maybe incorrectly) with a memory hit and extra processing. – MrSparkly Nov 09 '10 at 13:02

2 Answers

You don't have to capture the Horsie or the Birdie.

s/Horsie(Doggie)Birdie/$1/g;

A similar thing should work for JavaScript as well. This is probably as efficient as it gets, and at least as fast as using look-around assertions, although you should benchmark it if you want to know for sure. (The results, of course, will depend on the horsies, doggies and birdies in question.)
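For the JavaScript side, here is a minimal sketch of the same single-capture idea. The function name `keepMiddle` and the sample strings are made up for illustration; the three literals stand in for whatever real sub-patterns identify the first, middle and end parts.

```javascript
// Hypothetical helper: keeps the middle part and drops the parts around it.
// Only the middle sub-pattern is captured; "$1" in the replacement string
// refers to that single group.
function keepMiddle(text) {
  return text.replace(/Horsie(Doggie)Birdie/g, "$1");
}

console.log(keepMiddle("HorsieDoggieBirdie"));
console.log(keepMiddle("a HorsieDoggieBirdie b HorsieDoggieBirdie c"));
```

If the outer parts need grouping for alternation or quantifiers, `(?:...)` groups them without creating additional captures, so `$1` still refers to the middle part.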

Mandatory disclaimer: you should know what happens when you use regular expressions with HTML...

Community
mscha
  • Nice. The real string is quite big, which is why I was trying to avoid any captures and the associated memory hits. I think I can live with one :) – MrSparkly Nov 09 '10 at 13:15
You can use Look-Around Assertions:

s/(?:Horsie(?=Doggie))|(?:(?<=Doggie)Birdie)//g;
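A rough JavaScript equivalent might look like the sketch below. It assumes an engine that supports lookbehind `(?<=...)`, which was added to the language in ECMAScript 2018, so older browsers will reject the pattern. The function name `stripAround` is hypothetical. Note that the two alternatives match independently: "Horsie" is removed whenever "Doggie" follows it, even if no "Birdie" comes after.

```javascript
// Hypothetical helper: removes "Horsie" when followed by "Doggie" and
// "Birdie" when preceded by "Doggie", without any capture groups.
// Requires lookbehind support (ECMAScript 2018 or later).
function stripAround(text) {
  return text.replace(/Horsie(?=Doggie)|(?<=Doggie)Birdie/g, "");
}

console.log(stripAround("HorsieDoggieBirdie"));
// The alternatives are independent, so a lone prefix is stripped too:
console.log(stripAround("HorsieDoggie"));
```

The lookarounds are zero-width, so "Doggie" is never consumed and does not need to be put back in the replacement.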
Eugene Yarmash