
I am trying to parse strings of the form

{{name1 | filter1|filter2 |filter3}} into (name1, filter1, filter2, filter3).

I have a RegEx:

static const regex r("\\{\\{\\s*([\\.\\w]+)(\\s*\\|\\s*[\\.\\w]+)*\\s*\\}\\}");

I want to find all occurrences of the second group, which is marked with a Kleene star: (...)*. The problem is that I can only retrieve the last occurrence of the group.
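A minimal reproduction of the problem, using the regex above. The helper name `last_repeated_capture` is only illustrative; it returns whatever capture group 2 holds after a successful full match.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Illustrative helper (hypothetical name): returns the text held by
// capture group 2 after matching the whole input against the regex.
std::string last_repeated_capture(const std::string& input) {
    static const std::regex r(
        "\\{\\{\\s*([\\.\\w]+)(\\s*\\|\\s*[\\.\\w]+)*\\s*\\}\\}");
    std::smatch m;
    if (!std::regex_match(input, m, r)) {
        return "";
    }
    assert(m.size() == 3);  // whole match + two capture groups
    // A repeated group keeps only the text of its final iteration.
    return m[2].str();
}
```

For `{{name1 | filter1|filter2 |filter3}}` this yields only the part belonging to filter3; filter1 and filter2 are not available from the match result.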

As a workaround, I currently use the following regex:

static const regex r("\\{\\{\\s*([\\.\\w]+)((\\s*\\|\\s*[\\.\\w]+)*)\\s*\\}\\}");

This captures the whole substring " | filter1|filter2 |filter3" as the second group, which I then parse with another regex.

Is there a way to get every occurrence of the repeated group directly in C++?

The most similar question is here: Regex: Repeated capturing groups

Cœur
Sergey Palitsin
  • So, you want to replace `{{` by `(`, `}}` by `)` and `|` by `,`. Why such complex regex. – Tushar Jan 26 '16 at 14:53
  • Sergey, did you consider using raw string literals? Backslashes are not regex best friends. Use `R"(PATTERN_HERE)"`. C++ std::regex does not support such a thing as C# CaptureCollection. Match the whole substring and then split/parse. It is easier. I'd use [`std::regex r(R"(\{\{([^}]*(?:}(?!})[^}]*)*)\}\})");`](https://ideone.com/T6F7D2). – Wiktor Stribiżew Jan 26 '16 at 14:56
  • Alternatively, you can use [Boost regex library `match_results::captures`](http://www.boost.org/doc/libs/1_33_1/libs/regex/doc/captures.html). – Wiktor Stribiżew Jan 26 '16 at 15:12
  • Thank you Wiktor. Boost captures seem like the thing I was looking for. – Sergey Palitsin Jan 26 '16 at 15:55

1 Answer


You need to add another pair of () around the whole starred expression, so that the entire repetition becomes the second capture group.

(\s*\|\s*[\.\w]+)*

Here, the () group captures only one instance of SP | SP WORD (the last one matched), even though the "*" matches zero or more instances of it. Change it to:

((\s*\|\s*[\.\w]+)*)

Or, to make it clear that the inner () isn't a capturing group, use the non-capturing form (?:...):

((?:\s*\|\s*[\.\w]+)*)
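Here is a rough sketch of how the two-step parse could look with std::regex. It matches the tag once, then walks the captured filter chain with std::sregex_iterator. The helper name `parse_tag` and the second regex `[\.\w]+` are my own assumptions for illustration, not something taken from the question. The inner group is written in the non-capturing form (?:...).

```cpp
#include <cassert>
#include <regex>
#include <string>
#include <vector>

// Illustrative helper (hypothetical name): returns {name, filter1, ...},
// or an empty vector if the input is not a well-formed {{ ... }} tag.
std::vector<std::string> parse_tag(const std::string& input) {
    // Group 1 is the name; group 2 is the whole filter chain (may be empty).
    static const std::regex outer(
        R"(\{\{\s*([\.\w]+)((?:\s*\|\s*[\.\w]+)*)\s*\}\})");
    // Matches one identifier at a time; '|' and whitespace are skipped.
    static const std::regex word(R"([\.\w]+)");

    std::smatch m;
    if (!std::regex_match(input, m, outer)) {
        return {};
    }
    assert(m.size() == 3);  // whole match + two capture groups

    std::vector<std::string> parts{m[1].str()};
    // Keep the chain in a named string: sregex_iterator must not be
    // constructed from a temporary.
    const std::string filters = m[2].str();
    for (std::sregex_iterator it(filters.begin(), filters.end(), word), end;
         it != end; ++it) {
        parts.push_back(it->str());
    }
    return parts;
}
```

Calling `parse_tag("{{name1 | filter1|filter2 |filter3}}")` should give name1, filter1, filter2, filter3 in that order. If you can use Boost, the `match_results::captures` approach mentioned in the comments avoids the second pass.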
joeking