
Regex:

(?<lang2>this\s*is\s*a\s*test\s*string)|(?<lang1>test)

Sample text:

this is a test string

If you run each named capturing group individually, each one produces a match group with a value. But if you combine them as written above, the match returns one group instead of two. I need to capture both groups, so the output groups should look like this:

Matched Group 1: "this is a test string"

Matched Group 2: "test"

Zeeshan
  • Because the `test` in `this is a test string` has already been consumed with `(?<lang2>this\s*is\s*a\s*test\s*string)`. What are you doing? Please explain. Are you trying to build a dynamic regex and look for overlapping matches? – Wiktor Stribiżew Jul 03 '19 at 10:29
  • Yes, something like that: to identify all strings even if they have already been consumed. – Zeeshan Jul 03 '19 at 10:41
  • Then please post the relevant part of your code to see what you are doing. I think you should iterate via all possible patterns to see if they match or not. – Wiktor Stribiżew Jul 03 '19 at 10:55

2 Answers


In your pattern you use an alternation, so the first alternative matches the whole string into the first capturing group, and after that there is nothing left for the second alternative to match.

You could nest the capturing groups instead of using the alternation.

(?<lang2>this\s*is\s*a\s*(?<lang1>test)\s*string)
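
A minimal sketch of the nested pattern, written in Python purely for illustration (the question itself uses .NET). Python's `re` module spells named groups `(?P<name>...)` where .NET uses `(?<name>...)`, and the function name `nested_groups` is made up for this example. It assumes the position of "test" inside the longer phrase is known in advance.

```python
import re

# Same pattern as above, with Python's (?P<name>...) named-group syntax.
# lang1 is nested inside lang2, so one match fills both groups.
NESTED = re.compile(r"(?P<lang2>this\s*is\s*a\s*(?P<lang1>test)\s*string)")


def nested_groups(text):
    """Return both named groups from a single match, or None if nothing matches."""
    match = NESTED.search(text)
    return match.groupdict() if match else None


result = nested_groups("this is a test string")
print(result)  # {'lang2': 'this is a test string', 'lang1': 'test'}
```

A single search call is enough here because the inner group is part of the same match as the outer one.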
The fourth bird
  • I suspect OP might be building the pattern dynamically to find all matches for each alternative. This "nesting" won't help in this situation. – Wiktor Stribiżew Jul 03 '19 at 10:37
  • What if we don't know the position of the word "test"? – Zeeshan Jul 03 '19 at 10:41
  • @Zeeshan What exactly do you wish to accomplish? Can you add your code to the question? – The fourth bird Jul 03 '19 at 11:31
  • @WiktorStribiżew You might be right. Let's wait for some clarification. – The fourth bird Jul 03 '19 at 11:33
  • I'm using the .NET regular expression library and just calling the Regex.Match method, which returns only one group. I mean, I can iterate over both groups individually, and I have already done that, but I was hoping to avoid that loop and use just the Regex.Match method. – Zeeshan Jul 03 '19 at 11:45

Zeeshan!

The Regex Engine Always Returns the Leftmost Match

This is a very important point to understand: a regex engine always returns the leftmost match, even if a "better" match could be found later. When applying a regex to a string, the engine starts at the first character of the string. It tries all possible permutations of the regular expression at the first character. Only if all possibilities have been tried and found to fail, does the engine continue with the second character in the text. Again, it tries all possible permutations of the regex, in exactly the same order. The result is that the regex engine returns the leftmost match.

https://www.regular-expressions.info/engine.html
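
To see the leftmost-match rule on the pattern from the question, here is a small sketch in Python (chosen only for illustration; the question uses .NET, where named groups are written `(?<name>...)` instead of Python's `(?P<name>...)`). The `PATTERNS` dictionary and the `match_each` helper are hypothetical names. The second half shows the per-pattern loop suggested in the comments, which does report both values.

```python
import re

TEXT = "this is a test string"

# The alternation from the question, in Python's named-group syntax.
COMBINED = re.compile(r"(?P<lang2>this\s*is\s*a\s*test\s*string)|(?P<lang1>test)")

# The leftmost match starts at index 0 and is taken by the first alternative,
# so lang1 never participates and its group value is None.
first = COMBINED.search(TEXT)
combined = {name: value for name, value in first.groupdict().items() if value is not None}
print(combined)  # {'lang2': 'this is a test string'}

# Running each alternative as its own pattern finds overlapping text too.
PATTERNS = {
    "lang2": r"this\s*is\s*a\s*test\s*string",
    "lang1": r"test",
}


def match_each(text):
    """Search the text once per pattern and collect the first match of each."""
    found = {}
    for name, pattern in PATTERNS.items():
        match = re.search(pattern, text)
        if match:
            found[name] = match.group(0)
    return found


print(match_each(TEXT))  # {'lang2': 'this is a test string', 'lang1': 'test'}
```

The loop costs one search per pattern, but each search starts from the beginning of the text, so text consumed by one alternative is still visible to the others.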