
Given a string, e.g.:

static_string.name__john.id__6.foo__bar.final_string

but with an arbitrary number of label__value. components, how can I repeat the capture groups, split each one into label and value, and also capture the terminating final_string?

For the above I'd want [name, john, id, 6, foo, bar, final_string]

Is something like this possible when I don't know the number of label__value. components in advance?

This is for golang / RE2 if that matters.

Update: I don't have the luxury of doing this in a few lines of code, and need to do it in a single regex. The regex is defined in a config file for an application I don't control, so a code-based loop with conditionals etc. is unfortunately not possible.

Dean Taylor
Danielle M.
  • I'm not versed in Go, but this is fairly simple in Java: you define a capturing group and then loop through the results. Looking for something similar in Go, I found this, which may be helpful: https://stackoverflow.com/questions/30483652/how-to-get-capturing-group-functionality-in-golang-regular-expressions – Jorge Campos Jul 26 '19 at 17:35
  • When you say "arbitrary number of `label__value`", is there a maximum number of occurrences? – Dean Taylor Jul 26 '19 at 18:49
  • @DeanTaylor yes, probably in the region of 10 ish – Danielle M. Jul 26 '19 at 22:50

1 Answer


This depends entirely on what the application you are putting the regex into expects.

This answer focuses on getting you the capture groups in a basic way, while trying to avoid issues with RE2 and with the "thing" you are putting the regex into.

Note: With this method, final_string may not land at the capture group index you expect. It is always the last group in the pattern, and the groups belonging to unused pairs are left empty. Whether that matters again depends on what you are putting the regex into.

A regular expression that matches zero or one key/value pair (examples listed below it) is:

^[^.]+(?:\.([^.]+?)__([^.]+))?(?:\.([^.]+))$
  • static_string.final_string
  • static_string.name__john.final_string
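
As a quick check outside the target application, the pattern above can be run through Go's standard `regexp` package, which uses RE2 syntax. This is only a sketch for verifying behaviour; the helper name `groups` is made up for illustration and is not part of any config format.

```go
package main

import (
	"fmt"
	"regexp"
)

// onePair is the pattern from the answer: zero or one label__value pair,
// followed by the required final component.
var onePair = regexp.MustCompile(`^[^.]+(?:\.([^.]+?)__([^.]+))?(?:\.([^.]+))$`)

// groups (hypothetical helper) returns the capture groups without the
// full match, or nil when the input does not match at all.
func groups(s string) []string {
	m := onePair.FindStringSubmatch(s)
	if m == nil {
		return nil
	}
	return m[1:]
}

func main() {
	for _, s := range []string{
		"static_string.final_string",
		"static_string.name__john.final_string",
	} {
		fmt.Printf("%q\n", groups(s))
	}
	// Prints:
	// ["" "" "final_string"]
	// ["name" "john" "final_string"]
}
```

Note that in the first case the two pair groups are present but empty, and final_string is still group 3.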

To support one more key/value pair, repeat this part of the regular expression:

(?:\.([^.]+?)__([^.]+))?

So to support up to two key/value pairs, the regular expression is:

^[^.]+(?:\.([^.]+?)__([^.]+))?(?:\.([^.]+?)__([^.]+))?(?:\.([^.]+))$

This now supports the following additional example:

  • static_string.name__john.foo__bar.final_string
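
Since the expansion is purely mechanical, the pattern string could be generated rather than copied by hand and then pasted into the config file. A minimal sketch in Go, assuming you can run a one-off helper on your own machine; `buildPattern` is a made-up name:

```go
package main

import (
	"fmt"
	"strings"
)

// pair is the optional label__value fragment that gets repeated.
const pair = `(?:\.([^.]+?)__([^.]+))?`

// buildPattern (hypothetical helper) returns the regex text supporting
// up to maxPairs label__value components before the final component.
func buildPattern(maxPairs int) string {
	return `^[^.]+` + strings.Repeat(pair, maxPairs) + `(?:\.([^.]+))$`
}

func main() {
	// Print the two-pair pattern so it can be pasted into a config file.
	fmt.Println(buildPattern(2))
}
```

Calling `buildPattern(2)` yields the same text as the two-pair expression above.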

Expanding that to support up to 12 key/value pairs, the regular expression is:

^[^.]+(?:\.([^.]+?)__([^.]+))?(?:\.([^.]+?)__([^.]+))?(?:\.([^.]+?)__([^.]+))?(?:\.([^.]+?)__([^.]+))?(?:\.([^.]+?)__([^.]+))?(?:\.([^.]+?)__([^.]+))?(?:\.([^.]+?)__([^.]+))?(?:\.([^.]+?)__([^.]+))?(?:\.([^.]+?)__([^.]+))?(?:\.([^.]+?)__([^.]+))?(?:\.([^.]+?)__([^.]+))?(?:\.([^.]+?)__([^.]+))?(?:\.([^.]+))$

This supports the following additional examples:

  • static_string.name__john.id__6.foo__bar.final_string
  • static_string.name2_1b__john.id__6.foo__bar.final_string
  • static_string.name__john.id__6.foo__bar.name__john.id__6.foo__bar.name__john.id__6.foo__bar.name__john.id__6.foo__bar.final_string
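
To see what the consuming application would receive, here is a sketch that applies the 12-pair expression to the example from the question using Go's `regexp` package. Dropping the empty groups is an assumption about how a consumer might treat unused pairs; `extract` and `maxPairs` are illustrative names.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// maxPairs is the number of optional label__value groups in the pattern.
const maxPairs = 12

// re is the same 12-pair expression as above, assembled by repetition.
var re = regexp.MustCompile(
	`^[^.]+` + strings.Repeat(`(?:\.([^.]+?)__([^.]+))?`, maxPairs) + `(?:\.([^.]+))$`)

// extract (hypothetical helper) returns the non-empty capture groups in
// order, or nil if the input does not match. A group that participated
// in the match is never empty, because every group requires one or more
// characters, so filtering on "" only removes unused pairs.
func extract(s string) []string {
	m := re.FindStringSubmatch(s)
	if m == nil {
		return nil
	}
	var out []string
	for _, g := range m[1:] {
		if g != "" {
			out = append(out, g)
		}
	}
	return out
}

func main() {
	fmt.Println(extract("static_string.name__john.id__6.foo__bar.final_string"))
	// Prints: [name john id 6 foo bar final_string]

	// final_string is always the last group, whatever the number of pairs.
	fmt.Println(re.NumSubexp())
	// Prints: 25
}
```

An input with more than 12 pairs does not match at all, so the repeat count has to be at least the largest number of pairs you expect.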
Dean Taylor
  • I hit accept as this gets me way closer to what I wanted, but are you aware of a way to implement exactly this with an unknown number of key__value pairs? – Danielle M. Jul 26 '19 at 22:57
  • @DanielleM. I'm not, at least not without additional coding, i.e. not with just a regular expression. – Dean Taylor Jul 26 '19 at 22:58
  • @DanielleM. Also, don't feel the need to hit accept; you could leave it open, as perhaps someone else knows better than I do. Perhaps check back another time. – Dean Taylor Jul 26 '19 at 23:00