
I would like to match phrases like this:

  • having the same issue
  • facing the same problem
  • have the same question
  • I am getting the same issue
  • I see the same issue
  • I have same issue

But I do not want to match them if they are in the past tense. For example, anything containing the word "had" should be excluded:

  • I had the same issue
  • have had the same question

Later, I will add other words in past tense.

I tried this regex, but it still finds a match in "the same issue" even when that phrase is preceded by the word "had":

((?:i\s)?(?:have\s)?(?<!had\s)(?:(?:the\s|a\s)?same\s(?:(?:problem|question|issue)|here)))

https://regex101.com/r/Nvjtqj/1

Why does this regex still find the phrase "same issue" even when the word "had" comes before it?

Dharman
  • The `?` on your `(?:the\s|a\s)` group lets the negative lookbehind be checked at `same`, so the regex still matches within `had the same issue`, but it won't match `had same issue` https://regex101.com/r/8vwOTI/1 – Nick Mar 15 '20 at 22:35
  • @Nick That looks worthy to be down below (also). – Funk Forty Niner Mar 15 '20 at 22:39
  • @FunkFortyNiner I think it's covered pretty well by oriberu answer – Nick Mar 15 '20 at 22:53
  • @Nick Yeah I saw it too and upvoted. I wonder why it was downvoted though. I thought it was well-explained. I had a comment about it but deleted it. – Funk Forty Niner Mar 15 '20 at 23:04

2 Answers


First match and discard every phrase that contains one of the past-tense verbs you want to exclude, then match what you need:

(\b(?:i\s+)?(?:have\s+)?)(?:had|faced)\s+((?:the\s+)?same\s+(?:problem|question|issue|here))(*SKIP)(*F)|(?1)(?2)

See the regex demo

Details

  • (\b(?:i\s+)?(?:have\s+)?)(?:had|faced)\s+((?:the\s+)?same\s+(?:problem|question|issue|here))(*SKIP)(*F) - (*SKIP)(*F) makes the regex engine discard the text matched by the patterns before it and resume searching from the position where the failure occurred. Those patterns are:
    • (\b(?:i\s+)?(?:have\s+)?) - Group 1:
      • \b - word boundary
      • (?:i\s+)? - an optional group matching an i and then 1+ whitespaces
      • (?:have\s+)? - an optional group matching a have and then 1+ whitespaces
    • (?:had|faced) - had or faced
    • \s+ - 1+ whitespaces
    • ((?:the\s+)?same\s+(?:problem|question|issue|here)) - Group 2:
      • (?:the\s+)? - an optional group matching a the and then 1+ whitespaces
      • same\s+ - same and 1+ whitespaces
      • (?:problem|question|issue|here) - one of the words in the group
  • | - or, if the first alternative did not match, match and return the following:
    • (?1) - Group 1 pattern repeated
    • (?2) - Group 2 pattern repeated
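
Python's built-in `re` module supports neither `(*SKIP)(*F)` nor the `(?1)` subroutine syntax, so the pattern above will not compile there. As a rough sketch of the same idea using only the standard library, the unwanted past-tense phrasing can be matched in an uncaptured alternative and the wanted phrasing in a named group; matches where the group is empty are then ignored. The names `PATTERN`, `keep` and `wanted_matches` are made up for this illustration, and case-insensitive matching is an assumption.

```python
import re

# Alternative 1 (no capture): past-tense phrasing that should be discarded.
# Alternative 2 (group "keep"): the phrasing that should be reported.
# Because alternative 1 consumes the whole unwanted phrase, the engine
# cannot later find "same issue" inside it.
PATTERN = re.compile(
    r"\b(?:i\s+)?(?:have\s+)?(?:had|faced)\s+"
    r"(?:the\s+)?same\s+(?:problem|question|issue|here)"
    r"|(?P<keep>\b(?:i\s+)?(?:have\s+)?"
    r"(?:the\s+)?same\s+(?:problem|question|issue|here))",
    re.IGNORECASE,  # assumption: matching should ignore case
)


def wanted_matches(text):
    """Return only the matches that came from the 'keep' alternative."""
    return [
        m.group("keep")
        for m in PATTERN.finditer(text)
        if m.group("keep") is not None
    ]


print(wanted_matches("I had the same issue"))
print(wanted_matches("I have the same issue"))
print(wanted_matches("I had the same issue, and now I have the same problem"))
```

The trade-off compared with `(*SKIP)(*F)` is that the filtering happens in the calling code rather than inside the pattern.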
Wiktor Stribiżew
  • That's an interesting approach; I haven't seen recursion in a regex in ages and I don't think I have ever seen skip and fail. I learned something today, thanks. :) – oriberu Mar 15 '20 at 23:05
  • This actually reminded me of an older pattern: `no|(yes)` where you match what you don't want and what you do want, but capture only the latter. Then you iterate over all matches but ignore those without captures. Very similar in spirit. ^^ – oriberu Mar 15 '20 at 23:44
  • This works if it is on its own. When I included this as part of my larger regex like so: `()|()|()` it doesn't work anymore. Is there any way to keep it as part of my large regex or do I need to execute regex for each pattern and then merge results together? – Dharman Mar 15 '20 at 23:58
  • @Dharman If you have another regex, it is another question. Put all "exceptions" at the start of the regex. Adjust the recursion group IDs accordingly. – Wiktor Stribiżew Mar 16 '20 at 00:01
  • Not many of us know about those. Recommend you cite [How do (*SKIP) or (*F) work on regex?](https://stackoverflow.com/questions/24534782/how-do-skip-or-f-work-on-regex). – smci Dec 02 '20 at 10:44

When you don't anchor your lookarounds, the regex engine will simply give up a word in order to make the expression match: 'the' in this case, since 'same' is not directly preceded by 'had'.
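
To see this behaviour concretely, here is a small sketch that runs the regex from the question through Python's `re` module. The `re.IGNORECASE` flag and the helper name `first_match` are assumptions made for the illustration; the pattern itself is unchanged.

```python
import re

# The pattern from the question, copied as-is.
QUESTION_PATTERN = re.compile(
    r"((?:i\s)?(?:have\s)?(?<!had\s)"
    r"(?:(?:the\s|a\s)?same\s(?:(?:problem|question|issue)|here)))",
    re.IGNORECASE,  # assumption: case-insensitive, so "I" matches "i"
)


def first_match(text):
    """Return the first matched substring, or None if nothing matches."""
    m = QUESTION_PATTERN.search(text)
    return m.group(0) if m else None


# At "the", the lookbehind sees "had " and fails, so the engine moves on.
# At "same", the lookbehind sees "the " and passes; the optional
# (?:the\s|a\s)? group matches nothing, and "same issue" is returned.
print(first_match("I had the same issue"))   # same issue
# Here "same" itself is directly preceded by "had ", so nothing matches.
print(first_match("I had same issue"))       # None
print(first_match("I have the same issue"))  # I have the same issue
```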

Note that this is stretching the limits of what you can and should do with one expression and entering the territory of multiple checks and parsers. If you need to do this with an expression, it could be something like:

^(?!.*\b(?:had)\b)(?=.*same (?:problem|question|issue)).*

where you make a positive and a negative assertion from the same fixed position.
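
A minimal sketch of how this expression behaves when applied line by line with Python's `re` module. The names `LINE_FILTER` and `keep_line` and the `re.IGNORECASE` flag are assumptions for the illustration.

```python
import re

# Both lookaheads are evaluated from the start of the line:
#   (?!.*\bhad\b)     -> reject the line if "had" appears anywhere in it
#   (?=.*same (...))  -> require "same problem/question/issue" somewhere
LINE_FILTER = re.compile(
    r"^(?!.*\b(?:had)\b)(?=.*same (?:problem|question|issue)).*",
    re.IGNORECASE,  # assumption: matching should ignore case
)


def keep_line(line):
    """True if the line passes both assertions."""
    return LINE_FILTER.match(line) is not None


lines = [
    "I have the same issue",
    "I had the same issue",
    "have had the same question",
    "facing the same problem",
]
kept = [line for line in lines if keep_line(line)]
print(kept)

# The trailing .* consumes the whole line, not just the phrase.
whole = LINE_FILTER.match("Today I have the same issue again").group(0)
print(whole)
```

Note that this works as a whole-line filter: the match is the entire line, and any line containing "had" anywhere is rejected, even when "had" is unrelated to the phrase.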

oriberu
  • This looked promising, but it is very greedy. I like the idea nonetheless. – Dharman Mar 15 '20 at 22:50
  • @Dharman It is greedy, and for this specific example I actually wouldn't have needed the positive lookahead; I think I mistakenly superimposed a different kind of problem there. The explanation preceding the expression is valid, though. ;) – oriberu Mar 15 '20 at 22:53