I am trying to match only PHP code, such as the PHP code in this block:
<?php foo(); ?>
<abc>
<? foo(); ?>
<?php
foo();
bar();
?>
foo();
bar();
<? //also short open tag
foo();
bar();
?><?php
foo();
bar();
I want it to match only the code that is between the PHP tags: both blocks that have a PHP open tag and a closing tag, and blocks that have only a PHP open tag with no closing tag (as can happen at the very end of PHP code).
I tried many regex options and finally ended up with the one below, but it obviously doesn't work as I want: it is in /g mode, and it also selects the <abc>, which it shouldn't (Demo):
<\?.*[\s\S]*?(?:$|\?\>)
Is there any way to achieve this with regex in /gm mode?
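To make the problem concrete, here is a rough sketch of how my attempt behaves with and without the multiline flag. It uses Python's `re` module purely as a stand-in for the PCRE engine of the search program (an assumption on my part; the constructs used here behave the same way in both), and the names `SAMPLE`, `ATTEMPT`, `matches_g` and `matches_gm` are just illustrative.

```python
import re

# Sample input copied from the block above, joined without a trailing newline.
SAMPLE = "\n".join([
    "<?php foo(); ?>",
    "<abc>",
    "<? foo(); ?>",
    "<?php",
    "foo();",
    "bar();",
    "?>",
    "foo();",
    "bar();",
    "<? //also short open tag",
    "foo();",
    "bar();",
    "?><?php",
    "foo();",
    "bar();",
])

# The attempted pattern, unchanged.
ATTEMPT = r"<\?.*[\s\S]*?(?:$|\?\>)"

# /g only: `$` matches just at the end of the input, so the greedy `.*`
# swallows the first `?>` and the lazy part runs on to the NEXT `?>`.
matches_g = re.findall(ATTEMPT, SAMPLE)

# /gm: `$` now matches at every line end, so each match stops at the
# end of the line that contains the open tag.
matches_gm = re.findall(ATTEMPT, SAMPLE, flags=re.MULTILINE)

print(repr(matches_g[0]))   # first match in /g mode, spans <abc>
print(repr(matches_gm[2]))  # third match in /gm mode, only the open tag line
```

So in /g mode the first match runs across the `<abc>` line, while in /gm mode the multi-line blocks are cut off after the line with the open tag. Neither is what I want.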
Please note that the reason I am asking is that I am using a file search program: when I search the content of my many PHP files, I want it to search only inside PHP code and not return irrelevant results. So I will use this regex as an additional condition on top of the rest of the content search. The search program uses PCRE in /gm mode.
P.S. Before posting the question, I have done a lot of research on SO and could not find the solution to this question. Among other questions, I have also checked:
My regex is matching too much. How do I make it stop?
Get content between two strings PHP
Single regex to find string between two strings or started with single string only
Conclusion
I ended up using Julio's solution and improving it to also take into account single and double quotation marks, as mentioned in the example in Jan's answer. Thank you all for your answers.
This is the final regex that works in /gm mode:
<\?[\s\S]*?(?:\z|\?\>|[\"\'].*?[\"\'][\s\S]*?\?>)
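For anyone who wants to check this outside the search program, here is a minimal sketch that runs the final regex against the sample from the question. It again uses Python's `re` module as a stand-in for PCRE, which is an assumption: the only change is that PCRE's `\z` (absolute end of input) is written as `\Z`, the equivalent anchor in Python. The names `SAMPLE`, `FINAL`, `QUOTED`, `matches` and `quoted_matches` are illustrative only.

```python
import re

# Sample input copied from the question, joined without a trailing newline.
SAMPLE = "\n".join([
    "<?php foo(); ?>",
    "<abc>",
    "<? foo(); ?>",
    "<?php",
    "foo();",
    "bar();",
    "?>",
    "foo();",
    "bar();",
    "<? //also short open tag",
    "foo();",
    "bar();",
    "?><?php",
    "foo();",
    "bar();",
])

# Final pattern; `\Z` here plays the role of PCRE's `\z`.
FINAL = r"""<\?[\s\S]*?(?:\Z|\?\>|[\"\'].*?[\"\'][\s\S]*?\?>)"""

# Multiline flag set to mirror /gm; the pattern does not rely on `^` or `$`.
matches = re.findall(FINAL, SAMPLE, flags=re.MULTILINE)

# A closing tag inside a quoted string should not end the match early.
QUOTED = '<?php echo "?>"; ?>'
quoted_matches = re.findall(FINAL, QUOTED, flags=re.MULTILINE)

for m in matches:
    print(repr(m))
print(repr(quoted_matches))
```

On this sample the pattern yields five matches, one per PHP block, none of which contains `<abc>`, and the last one runs to the end of the input because it has no closing tag. The quote handling is only a heuristic for a single quoted string on one line, so I would not rely on it for arbitrary PHP.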