
I am trying to build a (multiline) pattern for a linter, to catch cases where:

  1. There is a Text( declaration
  2. That is not followed by .appFont( declaration...
  3. ...before the next occurrence of } (end of function) or Text( (another Text declaration...)

After many hours on regex101 (and consulting gpt...) I got these 2:

Text\([\s\S]*?\)[\s\S]*?(?!\.appFont)

This just catches the part that's before the .appFont, but I want the entire catch to fail if .appFont is found...

Text\([\s\S]*?\)[\s]*?(?!appFont)[\s\S]\}

This just catches everything, ignoring entirely whether appFont is in the string...

In the following example, only the 2nd case should be captured:

Text("blah") 
  .appFont(.body)
}

Text("blah") 
}

Text(
  "blah"
)
.appFont(.blah)
}

I tried to read about negative lookahead, but I think I am still using it wrong, or maybe causing it to be ignored when I add [\s\S]?

Aviel Gross
  • A lot depends on the tool. A lot of tools evaluate regexes one line at a time, so there's no way to write a regex that can check the next line. (Consider that a regex would end up having to check the _entire_ file and handle deeply nested functions, etc) – JDB Apr 17 '23 at 13:30
  • Have you tried working with [negation](https://www.regular-expressions.info/charclass.html#negated)? Something [like this demo](https://regex101.com/r/99hlyp/1). Depends on exact requirements. – bobble bubble Apr 17 '23 at 13:32
  • It's also worth trying [this variant](https://regex101.com/r/xenhiQ/1) if the `.appFont` can occur later on but before the next `}`. – bobble bubble Apr 17 '23 at 13:45
  • I didn't know about negation being a thing! Your last variant looks like it should do it, but it mistakenly captures when `Text` and `appFont` are on the same line, like so: `Text("blah").appFont(.foo)`. I can't tell why that happens from the regex you wrote? – Aviel Gross Apr 17 '23 at 14:02
  • @AvielGross That was my slip: I put the lookahead after `[^}]`, but it should be *before* it, [like this update](https://regex101.com/r/xenhiQ/2). – bobble bubble Apr 17 '23 at 14:18

1 Answer


Using a negated character class together with a negative lookahead.

Text\([^)]*\)(?:(?!\.appFont)[^}])*}

See this demo at regex101. The approach is similar to a tempered greedy token.

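regex explanation
`Text` matches the literal substring.
`\([^)]*\)` matches `(`, then any number of characters other than `)` (a negated class), up to the next closing `)`.
`(?:(?!\.appFont)[^}])*}` is a non-capturing group `(?:...)` repeated any number of times by `*`, followed by a literal `}`. The group contains:
`(?!\.appFont)`, a negative lookahead that checks, in front of each non-`}` character, that the substring `.appFont` does not start at that position. Each time the check succeeds, `[^}]` consumes one character, and this repeats up to the closing `}`.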
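
As a quick sanity check of the explanation above, here is a minimal sketch that runs the first pattern through Python's `re` module. Using Python is an assumption made only for illustration; the linter's own regex engine may behave differently. The sample combines the three snippets from the question with the single-line case raised in the comments.

```python
import re

# First pattern from the answer: the lookahead is checked before every non-} character.
PATTERN_TEMPERED = re.compile(r"Text\([^)]*\)(?:(?!\.appFont)[^}])*}")

# Snippets from the question, plus the same-line case from the comments.
SAMPLE = "\n".join([
    'Text("blah")',
    '  .appFont(.body)',
    '}',
    '',
    'Text("blah")',
    '}',
    '',
    'Text(',
    '  "blah"',
    ')',
    '.appFont(.blah)',
    '}',
    '',
    'Text("blah").appFont(.foo)',
    '}',
])

# Collect the full text of every match in the sample.
tempered_matches = [m.group(0) for m in PATTERN_TEMPERED.finditer(SAMPLE)]
print(tempered_matches)
```

Only the second snippet, the `Text("blah")` with no `.appFont` before its `}`, should be reported.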

Alternatively, use the lookahead assertion just once, right after the closing ).

Text\([^)]*\)(?![^}]*?\.appFont)[^}]*}

Another demo at regex101. This variant might even be a bit more efficient here.

regex explanation
`Text` matches the literal substring.
`\([^)]*\)` matches `(`, then any number of characters other than `)`, up to the next closing `)`.
`(?![^}]*?\.appFont)` is a negative lookahead (the condition): it checks that `[^}]*?\.appFont` is not ahead, where `[^}]*?` lazily matches any number of non-`}` characters up to the substring `.appFont`.
`[^}]*}` runs only if the condition succeeded (`.appFont` is not ahead) and consumes any number of non-`}` characters up to `}`.
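
The single-lookahead variant can be checked case by case in the same way. This is a minimal sketch, again assuming Python's `re` module purely for illustration; the labels in `CASES` are hypothetical names, and the snippets mirror the cases from the question and comments.

```python
import re

# Second pattern from the answer: one lookahead, evaluated right after the closing ")".
PATTERN_SINGLE_CHECK = re.compile(r"Text\([^)]*\)(?![^}]*?\.appFont)[^}]*}")

# Hypothetical labels mapped to the snippets discussed in the question and comments.
CASES = {
    "font on next line": 'Text("blah")\n  .appFont(.body)\n}',
    "no font": 'Text("blah")\n}',
    "multiline Text": 'Text(\n  "blah"\n)\n.appFont(.blah)\n}',
    "font on same line": 'Text("blah").appFont(.foo)\n}',
}

# True means the linter pattern would flag the snippet.
flagged = {
    label: PATTERN_SINGLE_CHECK.search(snippet) is not None
    for label, snippet in CASES.items()
}

for label, hit in flagged.items():
    print(f"{label}: {'flagged' if hit else 'ok'}")
```

Only the "no font" case should be flagged, because in every other snippet `.appFont` appears before the next `}` and the lookahead rejects the match.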
bobble bubble
  • Amazing! Thank you! So why does the 2nd option have the negated character class `[^}]` twice? – Aviel Gross Apr 17 '23 at 15:42
  • @AvielGross Welcome! A lookahead is an assertion triggered [at certain positions](https://stackoverflow.com/a/59115730/5527985). In the second solution the lookahead check `(?![^}]*?\.appFont)` is done once, after the closing parenthesis `)`. If the condition succeeds, the matching `[^}]*}` proceeds. In the first solution the lookahead is contained in a repetition and fired at each position (which can be costly, depending on the input). – bobble bubble Apr 17 '23 at 15:47
  • OK, that makes sense now. Thank you so much for the help and the extra details!! I would vote twice if I could :) – Aviel Gross Apr 17 '23 at 16:41
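
As the first comment on the question points out, a multiline pattern like this only works if the tool applies the regex to the whole file rather than line by line. Under that assumption, a lint check could report the line on which each offending `Text(` starts. This is a rough sketch in Python, not the API of any particular linter; `unstyled_text_lines` is a hypothetical helper name.

```python
import re

# Single-lookahead pattern from the answer, applied to the whole file contents at once.
UNSTYLED_TEXT = re.compile(r"Text\([^)]*\)(?![^}]*?\.appFont)[^}]*}")


def unstyled_text_lines(source):
    """Return 1-based line numbers where a Text( without .appFont starts."""
    return [
        # Count the newlines before the match start to get its line number.
        source.count("\n", 0, m.start()) + 1
        for m in UNSTYLED_TEXT.finditer(source)
    ]


# The three snippets from the question, treated as one file.
EXAMPLE_FILE = "\n".join([
    'Text("blah")',
    '  .appFont(.body)',
    '}',
    '',
    'Text("blah")',
    '}',
    '',
    'Text(',
    '  "blah"',
    ')',
    '.appFont(.blah)',
    '}',
])

reported_lines = unstyled_text_lines(EXAMPLE_FILE)
print(reported_lines)
```

For this input the only reported line should be the one where the second `Text("blah")` begins.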