
I'm running into an issue with the following.

I need to clean up HTML before rendering it to the browser.

The current regex matches everything like "{varname}" without problems. However, I need to exclude matches that are found within script tags.

(The example was a bit unclear, so I updated it.) Example:

<html>
<head></head>
<body>
this is an example `{var}` variable, <- this should be matched/removed
    <script>
    // don't match below arguments in other words don't let regex remove them/match them
    myMethod("{param1:'foo', param2:'bar'}");
    </script>
</body>
</html>
  • I'm not sure what you're asking here, but it sounds like you're parsing HTML with regex. [Don't do that](http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags#1732454). Use an HTML parser. – maček Dec 01 '11 at 19:31
  • What are you trying to do? I'm a little confused here. – gen_Eric Dec 01 '11 at 19:36
  • Hi Macek, I'm not parsing HTML with regex, I'm cleaning up. For your info, the data is set by the parser of CodeIgniter (CI uses {} for params/vars). However, I don't want {} in my output to the browser (this happens when vars weren't defined). That's why I want to clean up, but I want to keep the braces in the JavaScript, of course. – user1076168 Dec 01 '11 at 19:36
  • Wouldn't it be better to treat the root cause (undefined variables) rather than the symptom? – jprofitt Dec 01 '11 at 19:39
  • What's your current code? And just a guess: making it more specific would already avert matching contents with spaces, colons, quotes. – mario Dec 01 '11 at 19:42
  • @Mario i only have the match for the {} right now, which is pretty basic as in preg_replace("~{(.*?)}~i", "", $sContent); – user1076168 Dec 01 '11 at 19:43
  • Just did, mario, but I have to get used to Stack Overflow. I tried to put the preg_replace on a new line, but Enter posted the comment immediately, so that's why I replied with half an answer. – user1076168 Dec 01 '11 at 19:46
  • Excluding a section of HTML with regular expressions is akin to matching a section of HTML. You'll run into a heap of trouble either way. – Herbert Dec 01 '11 at 19:48
  • To those who close this - the question is obvious!!! It was some of the responses that clouded the issue. I see EXACTLY what he needs done... its regular expressions dudes! sheesh. – JasonMichael Nov 14 '12 at 23:39

1 Answer

Make it specific and just match alphanumeric characters:

preg_replace("~\{(\w+)}~i", "", $sContent); 

That alone already avoids the problem with the {x: 'y'} example, because spaces, colons and quotes are not word characters.


To exclude parts of the document with the preg functions, use preg_replace_callback: list the undesired part, (<script>.+?</script>)|..., as the first alternative in the pattern, then check in the callback which alternative matched and handle each case accordingly.
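The callback approach can be sketched roughly as follows. This is only an illustration under some assumptions: the function name stripUnresolvedPlaceholders is made up for the example (it is not part of CodeIgniter), the script alternative is loosened to `<script\b[^>]*>` so that tags with attributes are also skipped, and the `s` modifier is added so that `.` can span the newlines inside a script block.

```php
<?php
// Hypothetical helper for illustration; the name is not from CodeIgniter.
function stripUnresolvedPlaceholders(string $html): string
{
    // Alternative 1 (captured as group 1): a whole <script>...</script> block.
    // Alternative 2: a leftover {varname} placeholder made of word characters.
    // Modifiers: i = case-insensitive, s = let "." match newlines.
    $pattern = '~(<script\b[^>]*>.*?</script>)|\{\w+\}~is';

    $result = preg_replace_callback(
        $pattern,
        function (array $m): string {
            // Group 1 exists only when the <script> alternative matched:
            // return it unchanged. Otherwise a placeholder matched: drop it.
            return $m[1] ?? '';
        },
        $html
    );

    // preg_replace_callback returns null on a PCRE error; fall back to the input.
    return $result ?? $html;
}

$sContent = "this is {var} text <script>myMethod(\"{param1}\");</script>";
echo stripUnresolvedPlaceholders($sContent), "\n";
```

Because the script alternative comes first and consumes the whole block, anything inside it is never offered to the placeholder alternative, even if it happens to look like {param1}.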

mario
  • Thanks. I was so focused on excluding the script tags that I missed something as easy as matching only alphanumeric characters. Thanks! – user1076168 Dec 01 '11 at 20:02