I am looking for a regex that extracts everything after a given text (here, "HIT") up to the next pipe symbol "|" that is NOT enclosed in "[[" and "]]".

Example-Text:

text | HIT = [[t1|t2]] moretext [[more braces]] | moretext | moretext

This is the regex I've tried:

HIT[ \t]*=(.[^\|]+)

Of course, this only returns "HIT = [[t1", because the match stops at the first pipe, even though that pipe sits inside "[[…]]". I am looking for a regex that returns "HIT = [[t1|t2]] moretext [[more braces]]".

Thanks for your kind support

Christian

itsame69
  • Are your non-bracketed, 'terminating' pipes always surrounded by whitespace as in the example here? – Tim Nov 17 '10 at 17:49
  • I was looking for a solution that works with PHP. Gumbo's solution (see below) works perfectly with PHP. – itsame69 Nov 17 '10 at 18:19

1 Answer

Try this regular expression:

HIT[ \t]*=((?:[^[|]|\[\[[^[\]]*]])*)

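The (?:[^[|]|\[\[[^[\]]*]])* part matches zero or more repetitions of either

  • a single character other than [ and | ([^[|]), or
  • a [[…]] group whose content contains neither [ nor ] (\[\[[^[\]]*]]).

A bare | matches neither alternative, so the match ends at the first pipe that is not inside [[…]].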
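
To try the pattern above against the example text from the question, here is a minimal sketch in Python. The helper name extract_hit is hypothetical, and the brackets are written with explicit backslashes, which is intended as an equivalent spelling of the same pattern. The capture group keeps the whitespace around the value, so the sketch strips it.

```python
import re

# Escaped spelling of the pattern from the answer (assumed equivalent):
#   [^\[|]               one character that is neither "[" nor "|"
#   \[\[[^\[\]]*\]\]     a [[...]] group with no brackets inside
PATTERN = re.compile(r"HIT[ \t]*=((?:[^\[|]|\[\[[^\[\]]*\]\])*)")


def extract_hit(text):
    """Return the value after 'HIT =' up to the next pipe outside [[...]].

    Returns None when the text contains no 'HIT =' at all.
    """
    match = PATTERN.search(text)
    if match is None:
        return None
    # Group 1 includes the spaces around the value; drop them.
    return match.group(1).strip()


sample = "text | HIT = [[t1|t2]] moretext [[more braces]] | moretext | moretext"
print(extract_hit(sample))  # prints: [[t1|t2]] moretext [[more braces]]
```

In PHP, the same pattern should work with preg_match once it is wrapped in delimiters. Note that a single [ that does not open a [[…]] group also ends the match, since neither alternative accepts it.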
Gumbo
  • Gumbo, not only was the response extremely fast, it is also exactly what I was looking for and works perfectly! Thanks a lot for your help... – itsame69 Nov 17 '10 at 18:18
  • @itsame69: You’re welcome. Accepting my answer as *the* answer would also make it perfect for me. :) – Gumbo Nov 17 '10 at 19:27
  • Hmm... If you tell me how, I will definitely do that ;-). Sorry, I'm absolutely new here at Stack Overflow. – itsame69 Nov 18 '10 at 07:52
  • OK, I think I've just figured out how to do that ;-) – itsame69 Nov 18 '10 at 07:57