
My goal is to match a multiline string line by line with a regex. The problem is that the regex also matches the empty line.

The regex: ^(a)?$

The string:

a

The result:

Match 1 a
Group 1 a
Match 2 null
Group 1 null
Jason Rich Darmawan
  • Remove the quantifier? – Mad Physicist Aug 26 '22 at 18:25
  • If `a` is optional between the start and end anchors, then by design the regex will allow an empty string. What is your requirement? – anubhava Aug 26 '22 at 18:25
  • @anubhava Noted. My requirement is a `regex that can group a stringified table which has empty data in several rows and columns`. [Here is the previous SO question that contains the full requirements](https://stackoverflow.com/questions/73502078/is-it-possible-for-regex-to-recognize-whether-a-column-is-a-string-or-an-int/73502188#73502188). Anyway, is there an alternative quantifier? – Jason Rich Darmawan Aug 26 '22 at 18:33
  • Then you can try: `^(?!$)(a)?$` – anubhava Aug 26 '22 at 18:36
  • @anubhava It throws the error "the preceding token is not quantifiable". – Jason Rich Darmawan Aug 26 '22 at 18:41
  • @anubhava Sorry, my bad: Golang uses RE2, which does not support the negative lookahead operator. https://stackoverflow.com/questions/47211017/regex-expression-negated-set-not-working-golang – Jason Rich Darmawan Aug 26 '22 at 18:46
  • ok in golang try: `^(?:(a)|.+)$` – anubhava Aug 26 '22 at 18:48
  • That is as per the spec. You want to match & capture `a` or else match 1+ char – anubhava Aug 26 '22 at 18:56
  • @anubhava It works. I changed the first line to `abc` and the regex to `^(?:([a-z]+)|.+)$`. Thank you. – Jason Rich Darmawan Aug 26 '22 at 19:00
  • @anubhava After double checking, the capturing group `([a-z]+)` should have a quantifier, because the requirement is a `table which can have empty data in several rows and columns` -> `([a-z]+)?`. If you add that, it will match the empty line again. – Jason Rich Darmawan Aug 26 '22 at 19:04
  • No, you don't add `?`, because your desired match is on the LHS of `|` and is available in the capture group. The RHS is 1+ of any char, outside any capture group, so it falls outside your captured match. – anubhava Aug 26 '22 at 19:06

1 Answer


Converting my comment to an answer so that the solution is easy for future visitors to find.

You may use this regex in golang to prevent this behavior:

^(?:([a-z]+)|.+)$

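This matches and captures one or more lowercase letters or, failing that, matches one or more of any character without capturing it. Both branches require at least one character, so an empty line never matches.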

RegEx Demo

anubhava