First: I know this question has been beaten to death.

Second: I've looked at every resource (online editors, RexEgg, regular-expressions.info etc...) I can think of over the past few years and I still cannot grasp this part of regex. It never seems to work for me no matter what RegEx I use.

Now that the house-keeping is out of the way:

I have a large amount of text from which I need to extract some data, and I think regex is well suited to the job.

The text looks like this:

2017-03-31 09:41:18 EDT [12708-4] parameters: $1 = '0', $2 = 'ON', $3 = 'ON'

Fairly obviously, I want the values for $1, $2 and $3. This particular example has 3 variables but it's generally between 1 and 15.

I want a regex that will capture the following:

  1. $1
  2. '0'
  3. $2
  4. 'ON'
  5. $3
  6. 'ON'

This is my regex, which matches the first group:

\d{4}.+\[[\d-]*\].+?parameters:\s((\$\d+)\s?=\s?(['\d+\w+]+))

but no combination of pluses, parentheses and commas produces anything near what I want. Even if I remove the commas from the string and just jam them together I can't get it to capture.

This guy captures everything, but the groups don't make sense:

\d{4}.+\[[\d-]*\].+?parameters:\s(((\$\d+)\s?=\s?(['\d+\w+]+),?\s?)+)

Can someone end my misery here and explain to me how to capture repeated text in a regex if the text is separated by characters that I don't care about?

Brandon
  • You might want to take a look at `\G`, which matches at the position where the previous match ended. `(?:\G(?!^),|parameters:)\s+(\$\d+) = '(\w+)'` should be roughly what you are looking for. – Sebastian Proske Mar 31 '17 at 19:38
  • I couldn't get that one to work either. Still only captured one – Brandon Mar 31 '17 at 19:58
  • Try the multiple matching/capturing that Sebastian mentions, with a slightly modified pattern: [`(?:\G(?!^),|parameters:)\s*(\$\d+)\s*=\s*'([^']+)'\s*`](https://regex101.com/r/3nHI53/1). It won't give you 6 groups, because a match can only have as many groups as are defined in the pattern. It will give 3 matches with 2 groups in each. If this approach is not what you need, you just cannot get it with pure regex. What is the programming language? – Wiktor Stribiżew Mar 31 '17 at 20:17

1 Answer

I would suggest you use a regex that extracts one label/value pair at a time, like this: `(\$\d+) = '(.+?)'` (note `\d+` rather than `\d`, since the labels can go up to $15).

Then you can loop through all the matches. Group 1 will be the label (e.g. $1) and Group 2 will be the value (e.g. 0).
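As a rough sketch of that loop, here is one way it could look in Python. The question does not say which language is in use, so Python is an assumption, and the names `PREFIX`, `PAIR` and `extract_parameters` are made up for illustration. To deal with the surrounding log text, the sketch first isolates everything after `parameters:` and only then loops over the pairs. The `\s*` and `[^']*` in the pair pattern are slightly more tolerant variants of the pattern above, not a requirement.

```python
import re

line = "2017-03-31 09:41:18 EDT [12708-4] parameters: $1 = '0', $2 = 'ON', $3 = 'ON'"

# Step 1 (assumed layout): anchor on the log prefix and keep only the parameter list.
PREFIX = re.compile(r"\d{4}.+?\[[\d-]*\].+?parameters:\s*(.*)")
# Step 2: one label/value pair per match; group 1 = label, group 2 = value.
PAIR = re.compile(r"(\$\d+)\s*=\s*'([^']*)'")


def extract_parameters(text):
    """Return a list of (label, value) tuples, or [] if the line has no parameter list."""
    head = PREFIX.search(text)
    if head is None:
        return []
    # finditer yields one match object per pair, however many pairs there are.
    return [(m.group(1), m.group(2)) for m in PAIR.finditer(head.group(1))]


pairs = extract_parameters(line)
print(pairs)  # [('$1', '0'), ('$2', 'ON'), ('$3', 'ON')]
```

This gives three matches with two groups each rather than six groups in one match, which is the most a fixed pattern can offer when the number of parameters varies.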


Just a small warning I think is worth mentioning: "A repeated capturing group will only capture the last iteration. Put a capturing group around the repeated group to capture all iterations or use a non-capturing group instead if you're not interested in the data" (as noted on regex101)
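To make that warning concrete, here is a small Python sketch (Python is again an assumption, and the variable names are illustrative). It applies a repeated group to just the parameter list and shows that only the last iteration survives in the numbered groups. Wrapping the repetition in an outer group captures the whole repeated text as one string, not each iteration separately.

```python
import re

tail = "$1 = '0', $2 = 'ON', $3 = 'ON'"

# The inner groups sit inside a repeated non-capturing group.
repeated = re.compile(r"(?:(\$\d+)\s*=\s*'([^']*)',?\s*)+")
m = repeated.match(tail)
last_only = m.groups()
print(last_only)  # ('$3', 'ON') -- earlier iterations were overwritten

# An outer capturing group holds the entire repeated span as group 1.
wrapped = re.compile(r"((?:(\$\d+)\s*=\s*'([^']*)',?\s*)+)")
w = wrapped.match(tail)
print(w.group(1))  # $1 = '0', $2 = 'ON', $3 = 'ON'
print(w.group(2), w.group(3))  # $3 ON
```

So the outer group is useful for grabbing the whole parameter list in one piece, but the individual pairs still have to come from a second pass or from repeated matching.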

jjspace
  • There's a lot more unrelated stuff to the problem that prevents me from just capturing the two groups separately. I'm aware that it will only capture the last iteration. I tried getting around it by putting a capturing group around the repeated group but it didn't work. – Brandon Apr 03 '17 at 13:27