I have a page with a lot of text. I want to find any block in that page that is enclosed in a pair of double braces ( {{ }} ) and move the whole block to the very top of the page. The complication is that the block may itself contain other braces, single or double, and the move has to work in those cases too. The match should cover only the block from the opening {{ to its matching }}, including both delimiters. Any braces inside the block always come in equal numbers of opening and closing ones.

Example with input text (http://pastebin.com/7JcA7Wku):

Lorem ipsum dolor sit amet...
{{
some text with or without other { } / {{}}, but not odd, only even number of braces
To be captured
}}

... Donec sed scelerisque erat.

Is this possible to do via regex?

kennytm
XXN
  • Like [`(.*?)({{.*?}})` -> `$2$1`](https://regex101.com/r/yJ9nN4/1)? If not please illustrate what you need. – Wiktor Stribiżew Jul 15 '16 at 17:17
  • Or [`(.*?)({{(?>(?2)|(?:(?!}}).))*}})`](https://regex101.com/r/yJ9nN4/2)? Or perhaps, you need to prepend the pattern with `^` and use `while` to perform S&R until no change is done? – Wiktor Stribiżew Jul 15 '16 at 17:28
  • Which language are you using? – kennytm Jul 15 '16 at 17:35
  • @kennytm: not really a language. In AutoWikiBrowser with regex on Wikipedia :) – XXN Jul 15 '16 at 18:26
  • @XXN, So, according to https://en.wikipedia.org/wiki/Wikipedia:AutoWikiBrowser/Regular_expression, it is using .NET (C#) syntax. – kennytm Jul 15 '16 at 18:28
  • Actually, the QTax solution in the linked answer is not correct. – Wiktor Stribiżew Jul 15 '16 at 18:36
  • @Wiktor Stribiżew: thank you for trying to help me! The solution suggested in your first comment sort of works, but not exactly as I need. It moves the lead text after the first pair of closing braces, which in many cases is unacceptable for me: I have to move the text after the N-th pair of closing braces, where N is the number of pairs of opening braces. The second expression does not work in AWB: "Unrecognised grouping construct". – XXN Jul 15 '16 at 19:06
  • Try `(?s)(.*?)({{(?:(?!}}|{{).|(?<open>{{)|(?<-open>}}))*(?(open)(?!))}})` --> `$2$1`. See [demo](http://regexstorm.net/tester?p=(%3fs)(.*%3f)(%7b%7b(%3f%3a(%3f!%7d%7d%7c%7b%7b).%7c(%3f%3copen%3e%7b%7b)%7c(%3f%3c-open%3e%7d%7d))*(%3f(open)(%3f!))%7d%7d)&i=111%7b%7baaa%7d%7d%0d%0a222%7b%7baa%7ba%7d%7d%7d%0d%0a333%7b%7bc%7b%7bc%7bc%7d%7d%7d%7d%0d%0a444&r=%242%241%0d%0a). – Wiktor Stribiżew Jul 15 '16 at 20:18

1 Answer

Regular expressions, in the formal sense, can't count to an arbitrary depth, and therefore can't match balanced sets of delimiters. Such a balancing requirement instantly disqualifies whatever language possesses it from being "regular".

There are enhanced pattern-matching systems based on regular expressions, like the one in Perl, which allow you to nest a regex recursively inside itself to do this sort of thing, but at that point you're no longer using regular expressions.

You need something that parses context-free languages, not regular ones.
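
As a rough illustration of what such a parser can look like, here is a minimal Python sketch (not something AutoWikiBrowser can run, and not taken from any of the linked answers). It finds the first `{{` and then counts individual brace characters until the depth returns to zero. The function names `find_balanced_block` and `move_block_to_top` are made up for this example, and it assumes the braces inside the block really are balanced character by character.

```python
from typing import Optional, Tuple


def find_balanced_block(text: str, start: int = 0) -> Optional[Tuple[int, int]]:
    """Return (begin, end) slice indices of the first balanced {{...}} block.

    Assumption for this sketch: every '{' inside the block has a matching '}'.
    Returns None if there is no '{{' or the braces never balance.
    """
    begin = text.find("{{", start)
    if begin == -1:
        return None
    depth = 0
    for i in range(begin, len(text)):
        ch = text[i]
        if ch == "{":
            depth += 1          # one level deeper per opening brace
        elif ch == "}":
            depth -= 1          # one level back per closing brace
            if depth == 0:      # the opening '{{' is now fully closed
                return begin, i + 1
    return None                 # ran off the end: unbalanced input


def move_block_to_top(text: str) -> str:
    """Move the first balanced {{...}} block to the start of the text."""
    span = find_balanced_block(text)
    if span is None:
        return text             # nothing to move
    begin, end = span
    return text[begin:end] + text[:begin] + text[end:]


if __name__ == "__main__":
    sample = "Lorem {{a {b} {{c}} d}} tail"
    print(move_block_to_top(sample))
```

The depth counter is the "counting" that a formally regular expression cannot do. Counting single characters rather than `{{` / `}}` tokens is a design choice for this sketch: it lets nested single and double braces be handled by the same loop.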

Mark Reed
  • Can Python do this? – Y. zeng Aug 19 '22 at 01:51
  • See the various solutions at https://stackoverflow.com/questions/546433/regular-expression-to-match-balanced-parentheses. The python one is not regex-based and lives here: https://stackoverflow.com/questions/38211773/how-to-get-an-expression-between-balanced-parentheses/38212061#38212061 – Mark Reed Aug 19 '22 at 13:22
  • Is there a way to use it in VS Code for a LaTeX file? – Y. zeng Aug 20 '22 at 05:07