Regex question: how do I match a character that is NOT between two other characters, in this case curly brackets?

I have a string such as: word1 | {word2 | word3 } | word 4

I only want to match the first and last pipe, not the second one, which is between the brackets. I have made many attempts with negated carets and negative groupings and can't seem to get it to work.

Basically I am using this regex in a JavaScript split function to split this into an array containing: "word1", "{word2 | word3}", "word4".

Any assistance would be greatly appreciated!

BenMorel
shaun stewart

2 Answers

Try using this pattern

/\|(?![^{]*})/g

with this text

word1 | {word2 | word3 } | word 4 | word 4 | {word2 | word3 }

This should match all of the pipe symbols that are not inside {}.
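
As a rough sketch of how this pattern could be used with the JavaScript split mentioned in the question: the variable names below are illustrative only, and the trim step is an assumption about the desired result, since split keeps the spaces around each pipe. It also assumes the braces are balanced and not nested.

```javascript
// Pattern from the answer: a pipe that is not followed by "}"
// unless a "{" comes first.
const pipeOutsideBraces = /\|(?![^{]*})/g;

// Sample input from the question.
const input = "word1 | {word2 | word3 } | word 4";

// split() keeps the whitespace around each pipe, so trim every piece.
const parts = input.split(pipeOutsideBraces).map((part) => part.trim());

console.log(parts); // ["word1", "{word2 | word3 }", "word 4"]
```

Note that spaces inside each piece are preserved, so the last element is "word 4" rather than "word4".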

*edit - removed link to dead site (Thanks Dennis)

Diver

Depends on the language/implementation you're using, but...

\|(?![^{]*})

This matches a pipe unless it is followed by a } with no { in between, that is, unless the pipe sits inside a {...} group.


The (?! ... ) is known as a negative lookahead assertion. This is easier to understand if we start with a positive lookahead assertion:

\|(?=[^{]*})

The above only matches a pipe that is followed by a } without encountering a { first. When you negate that by replacing the = with a !, the match succeeds only where the positive lookahead would fail (also known as the complement).
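
To see the two assertions side by side, here is a small JavaScript sketch run against the string from the question. The helper name matchPositions is hypothetical and exists only for this illustration; it relies on String.prototype.matchAll, which requires the g flag.

```javascript
const text = "word1 | {word2 | word3 } | word 4";

// Positive lookahead: pipes followed by "}" with no "{" in between.
const insideBraces = /\|(?=[^{]*})/g;
// Negative lookahead: the complement of the pattern above.
const outsideBraces = /\|(?![^{]*})/g;

// Hypothetical helper: collect the index of every match.
const matchPositions = (regex, str) =>
  [...str.matchAll(regex)].map((match) => match.index);

const insidePositions = matchPositions(insideBraces, text);
const outsidePositions = matchPositions(outsideBraces, text);

console.log(insidePositions);  // [15]
console.log(outsidePositions); // [6, 25]
```

Every pipe in the string lands in exactly one of the two lists, which is what "complement" means here.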

Andrew Cheong