
I'm trying to match all text between {brackets}, but not when it is inside quotation marks. For example:

$str = 'value that I {want}, vs value "I do {NOT} want" ';

My results should capture "want" but omit "NOT". I've searched Stack Overflow extensively for a regex that does this, with no luck. I've seen answers that get the text between quotes, but none that get text in brackets only when it is outside quotes. Is this even possible?

And if so how is it done?

So far this is what I have:

preg_match_all('/{([^}]*)}/', $str, $matches);

Unfortunately, it gets all text inside brackets, including {NOT}.
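
To make the problem concrete, here is a minimal sketch that runs the pattern above against the sample string and prints the captured group. Nothing here is new apart from the `print_r` call; it only shows that both bracketed words come back.

```php
<?php
$str = 'value that I {want}, vs value "I do {NOT} want" ';

// The attempt from the question: capture whatever sits between { and }.
preg_match_all('/{([^}]*)}/', $str, $matches);

// $matches[1] holds the captured text without the brackets.
print_r($matches[1]);
```

Both "want" and "NOT" appear in `$matches[1]`, because the pattern has no notion of quoted text.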

  • This question is off-topic because it is asking about **possibilities** and **is not a concrete coding question**. If you want to know if something is possible you should **research it** and **attempt to implement it**. If you have issues while doing this you then can ask a **specific** question, **showing the code you have written**, your expected results, and your actual results. – John Conde Nov 17 '13 at 22:32
  • Sorry if I was unclear. Question edited – Obadyah Anthony Nov 17 '13 at 22:42
  • @ObadyahAnthony Do you want to support single quotes? I do have some mysterious tricks to make it work in one go :) – HamZa Nov 17 '13 at 22:49
  • yes, let me see what you got – Obadyah Anthony Nov 17 '13 at 22:57

2 Answers

6

It's quite tricky to get this done in one go. I also wanted to make it compatible with nested brackets, so let's use a recursive pattern:

("|').*?\1(*SKIP)(*FAIL)|\{(?:[^{}]|(?R))*\}

OK, let's explain this mysterious regex:

("|')                   # match either a single or a double quote and put it in group 1
.*?                     # match anything non-greedily until ...
\1                      # match what was matched in group 1
(*SKIP)(*FAIL)          # make it skip this match since it's a quoted set of characters
|                       # or
\{(?:[^{}]|(?R))*\}     # match a pair of brackets (even if they are nested)

Some PHP code:

$input = <<<INP
value that I {want}, vs value "I do {NOT} want".
Let's make it {nested {this {time}}}
And yes, it's even "{bullet-{proof}}" :)
INP;

preg_match_all('~("|\').*?\1(*SKIP)(*FAIL)|\{(?:[^{}]|(?R))*\}~', $input, $m);

print_r($m[0]);

Sample output:

Array
(
    [0] => {want}
    [1] => {nested {this {time}}}
)
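
One caveat is that an apostrophe counts as an opening single quote for this pattern. The sketch below compares the pattern above with a variant that treats only double quotes as delimiters. The variant is an assumption about what the question needs, not part of the original pattern, and the variable names are illustrative.

```php
<?php
$input = "it's a test and {this} won't match";

// Pattern from above: the two apostrophes are read as one quoted span,
// so the {this} between them is skipped.
preg_match_all('~("|\').*?\1(*SKIP)(*FAIL)|\{(?:[^{}]|(?R))*\}~', $input, $withSingle);

// Hypothetical variant: only double quotes open a quoted span,
// so apostrophes are treated as ordinary characters.
preg_match_all('~"[^"]*"(*SKIP)(*FAIL)|\{(?:[^{}]|(?R))*\}~', $input, $doubleOnly);

print_r($withSingle[0]);  // no matches
print_r($doubleOnly[0]);  // {this}
```

If single-quoted text never needs to be excluded, the double-quote-only variant is probably the safer choice for ordinary English sentences.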
HamZa
  • @ObadyahAnthony You're welcome. Be careful with sentences that have misleading single quotes, like `it's a test and {this} won't match`. – HamZa Nov 17 '13 at 23:08
  • @HamZa The only downside, which is a common one, is the case of nested quotes. – hwnd Nov 17 '13 at 23:10
  • @hwnd Indeed, which is maybe why the OP only wanted double quotes in the first place. – HamZa Nov 17 '13 at 23:12
  • @HamZa Yes, I was just making the comment so the OP realizes that if this case ever happens, it won't match, since nested quotes are also hard to match without balanced recursion. =) – hwnd Nov 17 '13 at 23:13
3

Personally, I'd process this in two passes: the first strips out everything between double quotes, and the second pulls out the text you want.

Something like this perhaps:

$str = 'value that I {want}, vs value "I do {NOT} want" ';

// Get rid of everything in between double quotes
$str = preg_replace("/\".*\"/U","",$str);

// Now I can safely grab any text between curly brackets
preg_match_all("/\{(.*)\}/U",$str,$matches);

Working example here: http://3v4l.org/SRnva
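
The two passes can also be wrapped in a small function. This is a sketch under a few assumptions: the function name `bracketsOutsideQuotes` is hypothetical, negated character classes replace the `U` modifier, and nested brackets are not handled.

```php
<?php
// Hypothetical helper (not from the answer above): returns the text inside
// each {...} pair that does not sit between double quotes.
function bracketsOutsideQuotes(string $str): array
{
    // Pass 1: drop every double-quoted span, quotes included.
    $unquoted = preg_replace('/"[^"]*"/', '', $str);

    // Pass 2: capture the text between each remaining pair of curly brackets.
    preg_match_all('/\{([^{}]*)\}/', $unquoted, $matches);

    return $matches[1];
}

print_r(bracketsOutsideQuotes('value that I {want}, vs value "I do {NOT} want" '));
```

For the sample string this prints an array containing only "want".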

jszobody