Consider the following

use namespace;

class name impl... {

    use Trait;
}

How would I go about extracting either the use clauses that come before the class definition or the ones that come after it? In the example above it would be simple enough, but it should also work on a real code file with multiple use clauses in both places, possibly not grouped together but with other code in between, and with all line breaks removed.

It's easy enough to get them all, but I want the match to either stop when it reaches the class or begin from the class. I just can't seem to get anything to work correctly.

Line breaks, comments and literals are stripped, so these do not need to be taken into consideration.

Daniel B
  • What do you mean by _"extract"_? It would be easier for us to help if we knew what you're trying to accomplish. – M. Eriksson Mar 20 '18 at 20:26
  • Trying to figure out how to extract class and trait names from use references in a PHP file. But to determine which is which, you must know on which side of the class definition it was extracted from. So my attempt is to extract first one and then the other. But to do this, I must have it stop or begin from the point where the class is defined. – Daniel B Mar 20 '18 at 20:31
  • Well, avoiding $use is easy enough; I do this with other extractions. But yes, I should probably mention that I already use RegExp to strip all literals from the file, to make these searches easier. – Daniel B Mar 20 '18 at 20:33
  • There are no lines; those are stripped along with comments and literals. – Daniel B Mar 20 '18 at 20:34
  • [Some people, when confronted with a problem, think "I know, I'll use regular expressions." Now they have two problems.](https://en.wikiquote.org/wiki/Jamie_Zawinski#Attributed) – Barmar Mar 20 '18 at 20:43
  • Use a parser. [H̸̡̪̯ͨ͊̽̅̾̎Ȩ̬̩̾͛ͪ̈́̀́͘ ̶̧̨̱̹̭̯ͧ̾ͬC̷̙̲̝͖ͭ̏ͥͮ͟Oͮ͏̮̪̝͍M̲̖͊̒ͪͩͬ̚̚͜Ȇ̴̟̟͙̞ͩ͌͝S̨̥̫͎̭ͯ̿̔̀ͅ](https://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags) Seriously though, what you're trying to do probably isn't going to work the way you want it to. You'd be better off running a match against a list of known packages. – jmcgriz Mar 20 '18 at 20:51

2 Answers

In order to detect use clauses occurring in the outermost scope, you need to remove all (nested) {...} blocks. A simple, non-recursive expression cannot do this in one pass (the nesting depth is unlimited), but you can apply block removal in a loop:

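$s = $code; // $code holds your (already stripped) source
$prev_s = null;
// Repeat until a pass changes nothing; each pass peels off one nesting level
while ($s !== $prev_s) {
    $prev_s = $s;
    // [^{}]* matches only innermost blocks, so nested braces are never cut in half
    $s = preg_replace('/\{[^{}]*\}/', '', $s);
}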

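Now you may collect the outer use clauses:

preg_match_all('/\buse\s+([\w\\\\]+)/', $s, $matches); // returns the match count; names land in $matches
$outer_uses = $matches[1]; // first capture group: the names, including namespace backslashes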
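
As an alternative to the loop, PCRE (the engine behind PHP's preg_* functions) supports recursive patterns, so nested blocks can also be removed in one pass. The following is only a minimal sketch under stated assumptions: the sample input and the variable names $code, $outer and $outerUses are made up for illustration. Note that any block-stripping approach also removes PHP 7 group use lists such as use Foo\{A, B}; because they use braces too.

```php
<?php
// Hypothetical input, already stripped of comments, literals and line breaks.
$code = 'use A\B; class X { use T; function f() { if (1) { } } } use C;';

// (?R) re-enters the whole pattern, so a single pass removes each
// top-level {...} block together with everything nested inside it.
$outer = preg_replace('/\{(?:[^{}]++|(?R))*\}/', '', $code);

// Collect the names of the use clauses that remain in the outer scope.
preg_match_all('/\buse\s+([\w\\\\]+)/', $outer, $matches);
$outerUses = $matches[1];
```

Only the clauses outside any braces survive the replacement, so the trait clause inside the class body is not collected.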
AndreyS Scherbakov
Removing blocks is not bad, but I need to get them all, including the ones in the inner scope; I just need to know which is which.

This is apparently too much for a single RegExp to handle, so this is what I did, in case others reading this are looking for an answer.

Use something to locate the offset starting position of the class declaration.

/\b(class|interface|trait)\s+[\w]+.*{/s

Combined with preg_match and its PREG_OFFSET_CAPTURE flag, this will provide you with the offset.
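
A minimal sketch of this step might look as follows. The function name findClassOffset and the sample input are assumptions made for illustration, and the pattern is a slightly tightened variant of the one above: [^{]* stops at the first opening brace instead of scanning to the last one.

```php
<?php
// Hypothetical input, already stripped of comments, literals and line breaks.
$code = 'use Some\Space; class Name implements Thing { use SomeTrait; }';

// Hypothetical helper: returns the byte offset of the class-like keyword,
// or null when no class, interface or trait declaration is found.
function findClassOffset(string $code): ?int
{
    $pattern = '/\b(?:class|interface|trait)\s+\w+[^{]*\{/';
    if (preg_match($pattern, $code, $m, PREG_OFFSET_CAPTURE) === 1) {
        // With PREG_OFFSET_CAPTURE each entry is [matched text, offset];
        // index 0 is the whole match.
        return $m[0][1];
    }
    return null;
}

$classOffset = findClassOffset($code);
```

The offset is that of the start of the match, so it points at the class keyword itself rather than at the opening brace.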

Then extract all of the use clauses, inner and outer scope.

/\buse\s+(?<full>([\w\\\]+(?:\s+as\s+[\w]+)?(?:\s*[,]\s*)?)+)(?:(?<=[\\\]){(?<inline>.*)})?\s*[;{]/

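Use this with preg_match_all, using PREG_SET_ORDER|PREG_OFFSET_CAPTURE, which will include the offset of each match. Note that the pattern is written as it appears inside a single-quoted PHP string, where [\w\\\] reaches the regex engine as [\w\\].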

Now simply compare each offset with the one extracted in the beginning. If it's lower, it's a reference clause (a namespace import). If it's higher, it's a trait clause.
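
Put together, the three steps could be sketched roughly like this. The function name classifyUses, the array keys and the sample input are hypothetical, and the use pattern is deliberately simplified compared to the one above: it captures only the first name of each clause and ignores aliases, comma lists and group syntax.

```php
<?php
// Hypothetical input, already stripped of comments, literals and line breaks.
$code = 'use Some\Space; use Other\Thing; class Name implements Thing { use SomeTrait; use Other; }';

// Hypothetical helper: splits use clauses into imports and traits by offset.
function classifyUses(string $code): array
{
    $result = ['imports' => [], 'traits' => []];

    // Step 1: offset of the class-like declaration. If none is found,
    // PHP_INT_MAX makes every use clause count as an import.
    $classOffset = PHP_INT_MAX;
    if (preg_match('/\b(?:class|interface|trait)\s+\w+/', $code, $m, PREG_OFFSET_CAPTURE) === 1) {
        $classOffset = $m[0][1];
    }

    // Step 2: every use clause together with its offset (simplified pattern).
    preg_match_all('/\buse\s+([\w\\\\]+)/', $code, $uses, PREG_SET_ORDER | PREG_OFFSET_CAPTURE);

    // Step 3: compare each offset with the class offset.
    foreach ($uses as $use) {
        [$name, $offset] = $use[1]; // capture group 1: [name, offset]
        $key = $offset < $classOffset ? 'imports' : 'traits';
        $result[$key][] = $name;
    }

    return $result;
}

$found = classifyUses($code);
```

Keeping a single pass over the use clauses and classifying by offset avoids having to write one expression that stops at, or starts from, the class declaration.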

Daniel B