7

I am very new to both Stack Overflow and regexes, so please forgive any mistakes.

I have been searching thoroughly for a Regex that matches all text that is not between curly brackets {} and then finds certain words in that text. For example, take the following string:

$content = 'Hello world, { this } is a string with { curly brackets } and this is for testing';

I would like a search for the word this to return only the second occurrence, because it's in the area that is not inside curly brackets. Even a Regex that just matches the substrings outside the curly brackets would simplify things for me. I found the Regex /(}([^}]*){)/, but it cannot select the parts "Hello world," and "and this is for testing" because these are not between } and {; it only selects the "is a string with" part.

I would also like to ask whether it is possible to combine two Regexes for a single purpose like mine. For example, the first Regex finds the strings outside {} and the second finds the specific words being searched for.

I want to use this Regex in PHP; for now I am using a function which is more like a hack. The purpose is to find specific words that are not inside {}, replace them reliably, and write the result to text files.

Thanks in advance for your help.

Shark

2 Answers

6

(*SKIP)(*F)

You're in luck, as PHP's PCRE regex engine has a syntax that is wonderful for this kind of task. This tidy regex works like a charm (see demo):

{[^{}]*}(*SKIP)(*F)|\bthis\b

Okay, but how does it work?

Glad you asked. The left side of the alternation | matches a complete {braces} group and then deliberately fails; (*SKIP) tells the engine not to retry inside the text it just consumed, so the search resumes right after the closing brace. The right side matches the this words you want, and we know they're the right ones because they weren't consumed by the expression on the left.
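
To see the effect of the left branch, here is a minimal sketch that counts matches with and without it. It assumes the sample string from the question; the variable names ($plain, $skipping and so on) are just for illustration.

```php
<?php
// Sample input taken from the question.
$content = 'Hello world, { this } is a string with { curly brackets } and this is for testing';

// A plain word search sees both occurrences, inside and outside the braces.
$plain = '~\bthis\b~';

// The exclusion pattern consumes each {...} group first and throws it away.
$skipping = '~{[^{}]*}(*SKIP)(*F)|\bthis\b~';

$plainCount = preg_match_all($plain, $content);
$skipCount  = preg_match_all($skipping, $content);

printf("plain: %d, with (*SKIP)(*F): %d\n", $plainCount, $skipCount);
// prints "plain: 2, with (*SKIP)(*F): 1"
```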

How to use it in PHP

Just the usual, something like:

// Single quotes keep the backslash in \b intact without relying on PHP's escape rules.
$regex = '~{[^{}]*}(*SKIP)(*F)|\bthis\b~';
$count = preg_match_all($regex, $string, $matches);

You'll want to have a look at $matches[0], which holds the full matches.
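
Since the question is ultimately about replacing, here is a rough sketch of how the same pattern might be used for both inspecting and replacing. It assumes the sample string from the question, and the replacement word "THAT" is made up for the example.

```php
<?php
// Sample input taken from the question.
$content = 'Hello world, { this } is a string with { curly brackets } and this is for testing';

$regex = '~{[^{}]*}(*SKIP)(*F)|\bthis\b~';

// PREG_OFFSET_CAPTURE turns each match into an array of [text, byte offset],
// which makes it easy to check which occurrence was found.
$count = preg_match_all($regex, $content, $matches, PREG_OFFSET_CAPTURE);

foreach ($matches[0] as $match) {
    echo $match[0], ' at offset ', $match[1], "\n";
}

// The same pattern works with preg_replace(); "THAT" is a hypothetical replacement.
$replaced = preg_replace($regex, 'THAT', $content);
echo $replaced, "\n";
```

Only the occurrence outside the braces should be reported and replaced; the { this } group is left untouched.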

Further reading about this and similar exclusion techniques

This situation is very similar to this question about "regex-matching a pattern unless...", which, if you're interested and enjoyed (*SKIP) power, you might like to read to fully understand the technique and how it can be extended.
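
As one example of a similar exclusion technique, the same idea can be sketched without backtracking verbs: match the brace groups on the left, capture the wanted word on the right, and decide in a callback. This is a variant for illustration, not the pattern from the answer above; the replacement word "THAT" is an assumption.

```php
<?php
// Sample input taken from the question.
$content = 'Hello world, { this } is a string with { curly brackets } and this is for testing';

// Group 1 only participates when the right-hand branch (outside braces) matched.
$regex = '~{[^{}]*}|\b(this)\b~';

$replaced = preg_replace_callback(
    $regex,
    function (array $m) {
        // If group 1 is set, we matched the word outside braces: replace it.
        // Otherwise a whole {...} group matched: put it back unchanged.
        return isset($m[1]) ? 'THAT' : $m[0];
    },
    $content
);

echo $replaced, "\n";
```

This version may be handy in engines that lack (*SKIP)(*F), at the cost of needing a callback.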

zx81
  • This works perfect. I knew there had to be something like (*SKIP) (*F) but I just didn't know it. And strangely no similar question on stackoverflow yet. At least I couldn't find one. Thanks – Shark Jun 12 '14 at 05:53
  • @Shark Thanks, glad you like it! :) Hey btw, I notice you haven't yet voted on StackOverflow. For any answer you find helpful, please consider voting up as this is how the reputation system works. No obligation, of course! Thanks for listening to my 10-second SO rep tutorial. :) – zx81 Jun 12 '14 at 05:58
  • @Shark Btw you'll find other SO questions with this technique, but yeah, maybe not at the top of google. For the full story I recommend you have a look at [the linked question](http://stackoverflow.com/questions/23589174/match-or-replace-a-pattern-except-in-situations-s1-s2-s3-etc/23589204#23589204), or save it for later, I had a lot of fun writing the answer. :) – zx81 Jun 12 '14 at 05:59
  • Thanks, I will definitely read all about it, as I spent two days trying to figure it out :D . And I haven't upvoted anyone yet because I didn't have the 15 reputation required to upvote. I have often found solutions here that I have upvoted in my heart :) – Shark Jun 12 '14 at 06:01
  • @Shark You have 18 rep now, and you might be on your way to the moon. :) That was a great question, not a rare one, but I particularly liked that you had given it a fair try yourself. Good on you for upvoting in your heart, that's brilliant. Hope to see you again sometime. :) – zx81 Jun 12 '14 at 06:04
  • well explained answer! btw the link "regex-matching... unless..." meant to point to regex101? shouldn't it point to [this answer from you](http://stackoverflow.com/a/23589204/3110638) which is very nice indeed and from where I learned about [The Greatest Regex Trick Ever](http://www.rexegg.com/regex-best-trick.html) cheers :] – Jonny 5 Jun 12 '14 at 07:09
1

If the strings are not very long, I'd use simple string manipulation functions to strip out the bracketed parts and make the rest searchable:

$content = 'Hello world, { this } is a string with { curly brackets } and this is for testing';

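function searchify($stack, $charStart = '{', $charEnd = '}') {
  $searchArea = '';
  // Every chunk after the first begins just past an opening delimiter.
  foreach (explode($charStart, $stack) as $string) {
    // strpos() can return 0 (falsy), so compare against false explicitly;
    // the limit of 2 guarantees exactly two parts for list().
    list($void, $ok) = (strpos($string, $charEnd) !== false
      ? explode($charEnd, $string, 2)
      : array('', $string));
    $searchArea .= $ok;
  }
  return $searchArea;
}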

This returns a cleaned string (the bracketed parts are gone), then strtr...

$replacing = array(
  'with' => 'this',
  "\n"   => '<br>',
  '  '   => '<br>',
);
$raw = searchify($content);
$replaced = strtr($raw,$replacing);
var_dump($replaced);

...to replace values in it.
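
If the bracketed parts have to survive in the output, a variation on the same string-function idea might look like the sketch below. The helper replaceOutside() is hypothetical (it is not part of the code above), and it assumes non-nested delimiters; an unbalanced opening delimiter is treated as ordinary outside text.

```php
<?php
// Hypothetical helper: apply strtr() only to the text outside the delimiters
// and copy each delimited part through unchanged.
function replaceOutside($text, array $replacing, $charStart = '{', $charEnd = '}') {
    $result = '';
    $pos = 0;
    $len = strlen($text);
    while ($pos < $len) {
        $open = strpos($text, $charStart, $pos);
        $close = ($open === false) ? false : strpos($text, $charEnd, $open);
        if ($open === false || $close === false) {
            // No further complete {...} part: translate the remainder and stop.
            $result .= strtr(substr($text, $pos), $replacing);
            break;
        }
        // Translate the text before the opener, then copy the bracketed part verbatim.
        $result .= strtr(substr($text, $pos, $open - $pos), $replacing);
        $result .= substr($text, $open, $close - $open + 1);
        $pos = $close + 1;
    }
    return $result;
}

$content = 'Hello world, { this } is a string with { curly brackets } and this is for testing';
// "THAT" is a made-up replacement value for the example.
$replaced = replaceOutside($content, array('this' => 'THAT'));
echo $replaced, "\n";
```

Note that strtr() replaces plain substrings, so unlike \bthis\b it would also touch "this" inside a longer word.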

  • I will try your answer later today. I wonder which is faster, a regex with preg_replace() or your function with str_replace? As I mentioned in the question, my purpose is to do a replace on all such matches in long strings. A regex using \bword\b for word boundaries, put in preg_replace, seems perfect as it's a one-line solution. – Shark Jun 12 '14 at 06:13
  • I do believe the fastest will probably be regexes, though this is easily implemented. How long are your strings? – Félix Adriyel Gagnon-Grenier Jun 12 '14 at 06:17
  • The strings are about the same length as my question or more. – Shark Jun 12 '14 at 06:21
  • I updated and completed that for the sake of it, but now I feel I misunderstood part of your question; the resulting string must still contain the original characters between the braces... – Félix Adriyel Gagnon-Grenier Jun 12 '14 at 07:45