7

How can I find all PHP variables in a string with preg_match? I wrote the following regular expression:

$string = 'Hallo $var. blabla $var, $iam a var $varvarvar gfg djf jdfgjh fd $variable';
$instring = array();
preg_match_all('/\$(.*?)/', $string, $instring);
print_r($instring);

I just don't understand how regular expressions work.

Alan Moore
botenvouwer
  • 2
    This will work in terms of `regex` but not in terms of `PHP`. Sample: `$%#` will be captured - but it's not a valid PHP variable. You may restrict to alphanumeric `$([\w\d]+)` - but then `${'foo'}` will fail the check. Conclusion - it's a _bad idea_ to try to implement syntax parsing with regex – Alma Do Oct 24 '13 at 10:10
  • @Alma Do Mundo: "Thanks" to non-greedy matching here, the zero-characters match for the star repetition already does it. The bad idea note is actually quite worth it, I could have put that angle into my answer as well and probably provide a link to [PHP Parser](https://github.com/nikic/PHP-Parser). – hakre Oct 24 '13 at 10:26
  • 1
    [`token_get_all()`](http://php.net/manual/en/function.token-get-all.php) can also do it. Just filter results by `T_VARIABLE` – nice ass Oct 24 '13 at 10:28
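
The tokenizer approach mentioned in the last comment could look roughly like the sketch below. It assumes the tokenizer extension is available (it is enabled by default) and that the input is PHP source starting with an opening tag. The helper name `find_variables_with_tokenizer` is made up for this illustration.

```php
<?php
// Hypothetical helper: collect variable names from PHP source via the tokenizer.
function find_variables_with_tokenizer(string $source): array
{
    $names = [];
    foreach (token_get_all($source) as $token) {
        // Multi-character tokens are arrays of [id, text, line];
        // single characters such as ";" are plain strings.
        if (is_array($token) && $token[0] === T_VARIABLE) {
            $names[] = $token[1];
        }
    }
    return $names;
}

// Without the opening tag, the whole input would be treated as inline HTML.
$code = '<?php $total = $price * $qty; echo $total;';
print_r(find_variables_with_tokenizer($code));
```

Unlike a regular expression, this leaves the decision about what counts as a variable to PHP's own lexer.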

4 Answers

22
\$(.*?)

is not the right regular expression to match a PHP variable name. The PHP manual itself gives a regular expression for a valid variable name (without the leading dollar sign):

[a-zA-Z_\x7f-\xff][a-zA-Z0-9_\x7f-\xff]*

So in your case I'd try with:

\$([a-zA-Z_\x7f-\xff][a-zA-Z0-9_\x7f-\xff]*)

instead. See the following example:

<?php
/**
 * Find all PHP Variables with preg_match
 *
 * @link http://stackoverflow.com/a/19563063/367456
 */

$pattern = '/\$([a-zA-Z_\x7f-\xff][a-zA-Z0-9_\x7f-\xff]*)/';

$subject = <<<'BUFFER'
Hallo $var. blabla $var, $iam a var $varvarvar gfg djf jdfgjh fd $variable
BUFFER;

$result = preg_match_all($pattern, $subject, $matches);

var_dump($result);
print_r($matches);

Output:

int(5)
Array
(
    [0] => Array
        (
            [0] => $var
            [1] => $var
            [2] => $iam
            [3] => $varvarvar
            [4] => $variable
        )

    [1] => Array
        (
            [0] => var
            [1] => var
            [2] => iam
            [3] => varvarvar
            [4] => variable
        )

)
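
For contrast, here is a small sketch of what the pattern from the question returns on the same input. Because the lazy group `(.*?)` has nothing after it in the pattern, it is satisfied with zero characters, so each match is just the dollar sign and the captured group is empty. The variable names `$count` and `$lazy` are chosen only for this illustration.

```php
<?php
$subject = 'Hallo $var. blabla $var, $iam a var $varvarvar gfg djf jdfgjh fd $variable';

// Lazy group with nothing following it: every match stops right after the "$".
$count = preg_match_all('/\$(.*?)/', $subject, $lazy);

var_dump($count);       // number of dollar signs found
var_dump($lazy[0][0]);  // full text of the first match
var_dump($lazy[1][0]);  // captured group of the first match
```

This prints `int(5)`, `string(1) "$"` and `string(0) ""`: the dollar signs are found, but no names are captured.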

If you'd like to understand how regular expressions work in PHP, read the relevant section of the PHP Manual and the documentation of the regular expression dialect it uses (PCRE). There is also a good book called "Mastering Regular Expressions", which I can recommend.


hakre
  • Thanks for quick response, I want to learn how regular expressions work so thanks again for leading me in the right direction. BTW your solution works. – botenvouwer Oct 24 '13 at 10:19
  • @sirwilliam: I also beautified the demo example a bit and made it actually part of the answer. As you can see it is linked, so you can use it to play around with it for learning about how PCRE works :) Playing is also a good way to learn. – hakre Oct 24 '13 at 10:23
  • I know, I play with PHP all day. Regex is an uninhabited place in my brain. I shall try to explore it. – botenvouwer Oct 24 '13 at 10:27
1

Thank you very much for the answers, which helped me a lot.

Here's an elaborated version of the regex, expanding the matches to array variables with numeric indices and an optional preceding logical negation:

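function get_variables_from_expression($exp){
    // Optional leading "!", the variable name, then any run of digits and square brackets.
    $pattern = '/((!\$|\$)[a-zA-Z_\x7f-\xff][a-zA-Z0-9_\x7f-\xff]*[0-9\[\]]*)/';
    preg_match_all($pattern, $exp, $matches);
    // $matches[0] holds the full matches, e.g. "!$a[5]".
    return $matches[0];
}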

$example = '(($aC[5][7][1]xor((((!$a[5]&&!$a[4])&&!$a[3])&&!$a[2])&&$aC[6][6][0]))xor$aC[6][6][2])';
$list = get_variables_from_expression($example);

foreach($list as $var){
    echo "$var<br>";
}

results in:

$aC[5][7][1]
!$a[5]
!$a[4]
!$a[3]
!$a[2]
$aC[6][6][0]
$aC[6][6][2]
1

Thanks for the answer by hakre. I extended it a little to also match array elements and object properties:

\$(([a-zA-Z_\x7f-\xff][a-zA-Z0-9_\x7f-\xff]*(->)*)*(\[[^\]]*\])*)

That should match the following variables:

$this->that
$this["sure"]["thing"][0]
$var
$_GET["id"]
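
A quick way to check this is to run the pattern over those four examples, as in the sketch below. The test string is an assumption made up for this illustration.

```php
<?php
$pattern = '/\$(([a-zA-Z_\x7f-\xff][a-zA-Z0-9_\x7f-\xff]*(->)*)*(\[[^\]]*\])*)/';

// Single quotes keep PHP from interpolating the sample variables.
$subject = '$this->that $this["sure"]["thing"][0] $var $_GET["id"]';

preg_match_all($pattern, $subject, $matches);

// $matches[0] contains the complete matches, including the dollar sign.
print_r($matches[0]);
```

Note that every part after the dollar sign is optional in this pattern, so a lone `$` also matches, and an index that itself contains `]` (such as `$a[$b[0]]`) is cut off at the first closing bracket.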
Jeff Baker
0

To find array variables whose keys are written without quotes (for example `$row[name]`), you can use this regex (Perl-compatible syntax):

\$([a-zA-Z_\x7f-\xff]*)\[([a-zA-Z]{1}[a-zA-Z0-9_]{1,32})\]

Then, to add single quotes around the key, replace each match with:

\$$1['$2']
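
The answer does not say which tool applies the replacement. As one possibility, the same substitution can be sketched in PHP with `preg_replace_callback`, which avoids any question about how `$` is escaped in a replacement string. The function name `quote_bare_array_keys` is hypothetical.

```php
<?php
// Hypothetical helper: wrap unquoted array keys in single quotes.
function quote_bare_array_keys(string $code): string
{
    $pattern = '/\$([a-zA-Z_\x7f-\xff]*)\[([a-zA-Z]{1}[a-zA-Z0-9_]{1,32})\]/';

    return preg_replace_callback(
        $pattern,
        // $m[1] is the variable name, $m[2] the unquoted key.
        function (array $m): string {
            return '$' . $m[1] . "['" . $m[2] . "']";
        },
        $code
    );
}

echo quote_bare_array_keys('echo $row[name] . $row[id2] . $row[0];'), "\n";
```

As written, the pattern only matches keys that start with a letter and are 2 to 33 characters long, so numeric indices and one-letter keys are left untouched.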

ShopDev