Here's what I'm trying to do; I've been struggling with it for some time.

Let's say we have this input:

{{something|a}} text text {{another|one|with|more|items}}

What I'm trying to achieve:

[
    ["something", "a"],
    ["another", "one", "with", "more", "items"]
]

The simple way would be something like:

"{{something|a}} text text {{another|one|with|more|items}}".scan(/([^\|\{\}]+)/)

But this yields, quite predictably, every piece as its own single-element array in one flat list (also note that I do not want "text text" in the results, just the items inside the curly braces):

[["something"], ["a"], [" text text "], ["another"], ["one"], ["with"], ["more"], ["items"]] 

I then tried doing it like (see script here):

\{\{(([^\|\{\}]+)\|?)+\}\}

But I must be doing something wrong.

Any help will be appreciated! :)

Dr.Kameleon
  • You can't get all capture values of some group in Ruby. There are always as many captures as the capturing groups in the pattern. So, `.scan(/{{(.*?)}}/).flatten.map{ |x| x.split("|") }` [seems to work](https://ideone.com/s6iYK0) here. – Wiktor Stribiżew Dec 17 '20 at 09:14
  • @WiktorStribiżew So simple and so straightforward. I should have thought of approaching it like that... Thanks a lot! (Just post it as an answer and I will gladly accept it :)) – Dr.Kameleon Dec 17 '20 at 09:16

1 Answer

You can't get all captured values of a repeated capturing group in Ruby. There are always as many captures as the capturing groups in the pattern.
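To illustrate, here is a minimal sketch of what the pattern from the question returns with scan. The helper name `repeated_group_captures` is made up for this example. The pattern has two capturing groups, so each match yields exactly two values, and each repeated group keeps only what it captured on its last iteration:

```ruby
# Hypothetical helper (not from the question): runs the attempted pattern
# with its repeated capturing groups through String#scan.
def repeated_group_captures(text)
  text.scan(/\{\{(([^|{}]+)\|?)+\}\}/)
end

input = '{{something|a}} text text {{another|one|with|more|items}}'
p repeated_group_captures(input)
# => [["a", "a"], ["items", "items"]]
```

The earlier items ("something", "another", "one", ...) are matched but overwritten as the group repeats.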

Thus, you need to throw in some more code to get the expected output:

s = '{{something|a}} text text {{another|one|with|more|items}}'
p s.scan(/{{(.*?)}}/).flatten.map{ |x| x.split("|") }
# => [["something", "a"], ["another", "one", "with", "more", "items"]]

See the Ruby demo.

Note that the {{(.*?)}} pattern matches a {{ substring, then any zero or more chars other than line break chars, as few as possible, and then }}. Since the pattern contains one capturing group, scan returns an array of single-element arrays, so .flatten turns the result into a flat array of strings. Finally, x.split("|") within the map call splits each captured value at |.
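The three steps can be inspected one at a time. This is only a sketch: the helper `split_brace_groups` and its hash keys are names invented for illustration, not part of any library.

```ruby
# Hypothetical helper: returns the intermediate value after each step
# of the scan / flatten / map-split chain.
def split_brace_groups(text)
  captures = text.scan(/{{(.*?)}}/)            # one single-element array per {{...}} match
  flat     = captures.flatten                  # plain array of captured strings
  items    = flat.map { |x| x.split("|") }     # split each captured string at the pipes
  { captures: captures, flat: flat, items: items }
end

steps = split_brace_groups('{{something|a}} text text {{another|one|with|more|items}}')
p steps[:captures]
# => [["something|a"], ["another|one|with|more|items"]]
p steps[:flat]
# => ["something|a", "another|one|with|more|items"]
p steps[:items]
# => [["something", "a"], ["another", "one", "with", "more", "items"]]
```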

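NOTE: if there can be line breaks in between {{ and }}, add the /m modifier: /{{(.*?)}}/m. Or, unroll the pattern for better efficiency: /{{[^}]*(?:}(?!})[^}]*)*}}/ (see Rubular demo). The unrolled pattern as written has no capturing group, so wrap the part between the braces in parentheses if you use it with scan as above.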
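A small sketch comparing the variants on input with a line break inside the braces. The constant names and the sample string are made up for this example, and the unrolled pattern has a capturing group added around the inner part so that scan returns the text between the braces:

```ruby
# Assumed names for illustration only.
LAZY           = /{{(.*?)}}/                       # . stops at line breaks
LAZY_MULTILINE = /{{(.*?)}}/m                      # /m lets . match line breaks too
UNROLLED       = /{{([^}]*(?:}(?!})[^}]*)*)}}/     # unrolled, with a capturing group added

multi = "{{first|a\nb}} text {{second|c}}"

p multi.scan(LAZY).flatten
# => ["second|c"]
p multi.scan(LAZY_MULTILINE).flatten
# => ["first|a\nb", "second|c"]
p multi.scan(UNROLLED).flatten
# => ["first|a\nb", "second|c"]
```

The negated character class [^}] matches line breaks on its own, which is why the unrolled version needs no /m flag.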

Wiktor Stribiżew