0

Here's the reduced case of PHP code:

use Package;
use Package2;

class {
    use Trait;

    function fn() {
       function() use ($var) {

       }
    }
}

I'd like to match only the use before Package; and Package2;, and neither use Trait nor use ($var).

Neither negative lookahead nor negative lookbehind seems to work. I tried the approach from "Regular Expression, match characters outside curly braces { }".

It obviously doesn't work: https://regex101.com/r/L6N4Ye/1

Using the PCRE interpreter.

Cid

2 Answers

1

Regex might not be the best choice here, but you could use one if you have control over the format of the code you are parsing. Otherwise, using a PHP parser would be the best idea.

With that in mind, how about checking whether the use is at the beginning of a line (^, with the multiline flag)?

^use\s+(?![^{]*\})

see here
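
As a rough sketch of how the pattern above behaves, here it is applied to the question's sample. Python's re module is used as an assumption (the question uses PCRE, but the pattern relies only on features both engines share). The capture group for the package name and the helper name top_level_uses are additions for illustration. The second input is made up for this sketch and shows one case where the lookahead gives a false positive: a class-level use at column 0 followed by a method body.

```python
import re

# Pattern from the answer, with a capture group appended (an addition)
# so the package name can be read back. re.MULTILINE makes ^ match at
# the start of every line, not only at the start of the string.
TOP_LEVEL_USE = re.compile(r"^use\s+(?![^{]*\})([\w\\]+)\s*;", re.MULTILINE)

SAMPLE = """\
use Package;
use Package2;

class {
    use Trait;

    function fn() {
       function() use ($var) {

       }
    }
}
"""

# Hypothetical input: unindented trait import followed by a nested block.
UNINDENTED = "class A {\nuse SomeTrait;\nfunction f() {\n}\n}\n"


def top_level_uses(source):
    """Return the names captured after each line-initial `use`."""
    return TOP_LEVEL_USE.findall(source)


print(top_level_uses(SAMPLE))      # ['Package', 'Package2']
print(top_level_uses(UNINDENTED))  # ['SomeTrait'] (a false positive)
```

The false positive happens because the lookahead only checks whether a } can be reached without crossing a {, so any { between the use and the closing brace hides the enclosing block.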

Nicolas
  • Saw this approach before and I was wondering if there are no edge cases other than poor code formatting? – Jakub Mikita Feb 07 '20 at 14:07
  • It depends on your context. If you have access to the code, you could simply reformat it so that the `use` statements outside the class have no space before them. If not, then you might have to use something other than regex. Cid was mentioning a PHP parser, which is a wonderful idea. @JakubMikita – Nicolas Feb 07 '20 at 14:09
  • @JakubMikita No regex will help you, see [how this solution fails](https://regex101.com/r/L6N4Ye/3) – Wiktor Stribiżew Feb 07 '20 at 14:09
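
One possible middle ground between a single regex and a full PHP parser is to walk the source once and keep a brace-depth counter, accepting a use only at depth 0. The sketch below is written in Python as an assumption, and the name top_level_use_statements is hypothetical. It is not a parser: braces or the word use inside strings, comments, or heredocs would confuse it.

```python
import re

# Tokens of interest: braces, and the keyword `use` unless it is followed
# by "(" (which would be a closure's `use ($var)` clause).
TOKEN = re.compile(r"[{}]|\buse\b(?!\s*\()")

SAMPLE = """\
use Package;
use Package2;

class {
    use Trait;

    function fn() {
       function() use ($var) {

       }
    }
}
"""


def top_level_use_statements(source):
    """Return each `use ...;` statement found outside all curly braces.

    Assumption: strings and comments contain no braces and no `use`.
    """
    depth = 0
    statements = []
    for m in TOKEN.finditer(source):
        tok = m.group()
        if tok == "{":
            depth += 1
        elif tok == "}":
            depth = max(depth - 1, 0)  # tolerate unbalanced input
        elif depth == 0:
            end = source.find(";", m.start())
            if end != -1:
                statements.append(source[m.start():end + 1])
    return statements


print(top_level_use_statements(SAMPLE))  # ['use Package;', 'use Package2;']
```

Unlike the line-anchored pattern, this does not depend on indentation, so an unindented trait import inside a class is still skipped.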
0

I am not familiar with PHP syntax, so please forgive any syntactic considerations I have missed.

Since, in this particular case, you are sure that every use you are interested in lies before the class boundary, it may help to look for every use that is not preceded by a {. This can be achieved with the following regex, which uses a negative lookbehind for {:

(?<!\{\s{0,100})\s*use\s*(?<pkg>.*);

After applying this to the entire source code, you may look for the groups named pkg in the matched substrings.

However, the not-so-good part of the negative lookbehind is the \s{0,100}, which I included only to allow whitespace after the opening brace. There must be a better way to do this. I had to do it because negative lookbehinds need a computable maximum length, which is why \s* will not work.
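
One way to avoid the arbitrary 100-character cap is to move the "not preceded by {" check out of the pattern and into code. The sketch below does that in Python, which is an assumption on my part: Python's re requires fixed-width lookbehind, so the bounded quantifier above would not compile there, and named groups are written (?P<pkg>...). The helper name uses_not_after_brace is hypothetical, and this is a swapped-in emulation of the lookbehind, not the regex above run as-is.

```python
import re

# Same shape as the regex above, minus the lookbehind.
USE_STMT = re.compile(r"use\s*(?P<pkg>.*);")
# Emulates the lookbehind: does the text before the match end with
# "{" followed only by whitespace (of any length)?
AFTER_BRACE = re.compile(r"\{\s*\Z")

SAMPLE = """\
use Package;
use Package2;

class {
    use Trait;

    function fn() {
       function() use ($var) {

       }
    }
}
"""


def uses_not_after_brace(source):
    """Collect the pkg group of every `use ...;` not directly after a `{`."""
    pkgs = []
    for m in USE_STMT.finditer(source):
        if AFTER_BRACE.search(source[:m.start()]):
            continue  # directly follows an opening brace, so skip it
        pkgs.append(m.group("pkg"))
    return pkgs


print(uses_not_after_brace(SAMPLE))  # ['Package', 'Package2']
```

Like the original regex, this only rejects a use that directly follows an opening brace, so a use placed later in a class body would still be matched.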

My assumptions on the syntax:

  1. use is always lowercase
  2. A use package statement always ends with ;
  3. Whitespace is allowed freely between tokens as in the case of Java
Sree Kumar