
I'm debugging some legacy code and want to use a pre-built function that is essentially a wrapper for get_defined_vars().

Running this code directly in the calling file prints an array of variables as expected:

print_r(get_defined_vars());

However, wrapping this in a simplified version of my function prints an empty array:

function debugAllVars() {
    print_r(get_defined_vars());
}
debugAllVars();

Regardless of the scope, I would have expected "superglobal" variables such as $_POST to be present in the output.

Why is the output completely empty?

IMSoP
kchason
  • _"It is not a method (inside a class) so I don't believe we'd be running into scope issues"_ - A function has its own scope as well. – M. Eriksson Dec 05 '17 at 16:03
  • This function returns a multidimensional array containing a list of all defined variables, be them environment, server or user-defined variables, **within the scope** that get_defined_vars() is called. – AbraCadaver Dec 05 '17 at 16:04
  • @MagnusEriksson, agreed, that was more in reference to the only other question I've found like this. Would the `_POST` not be in scope for child functions? – kchason Dec 05 '17 at 16:04
  • This is actually a rather interesting question, and my guess is that the "superglobals" (`$_FOO` and the magic `$GLOBALS` array) are implemented as a special case in the code, rather than being imported into each new function scope. This behaviour has apparently been stable since PHP 4.3: https://3v4l.org/dWr1Q – IMSoP Dec 05 '17 at 18:12

2 Answers

get_defined_vars() returns all variables in the "symbol table" of the scope where it is called. Wrapping it in debugAllVars introduces a new scope, which has its own symbol table. Since debugAllVars takes no parameters and declares no variables, that table is empty.

For a standalone function like this, the symbol table consists of:

  • the function's parameters
  • any global variables imported into the current scope using the global keyword
  • any static variables declared in the current scope with the static keyword (even if not assigned a value)
  • any variables implicitly declared by assigning a value to them
  • any variables implicitly declared by taking a reference to them (e.g. $foo = &$bar would implicitly declare $bar if not already defined; $foo = $bar would not)
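The list above can be checked with a short sketch. The names inspectScope, $config, $counter, $local, $alias and $target are hypothetical and chosen only for illustration; the function uses one entry of each kind and returns its own symbol table:

```php
<?php
// Hypothetical global, used only for this illustration.
$config = 'global value';

function inspectScope($param)   // $param: a function parameter
{
    global $config;      // global imported with the global keyword
    static $counter;     // static variable, never assigned a value
    $local = 42;         // implicitly declared by assignment
    $alias = &$target;   // taking a reference implicitly declares $target

    return get_defined_vars();
}

$names = array_keys(inspectScope('arg'));
sort($names);            // sort so the check does not depend on internal ordering
print_r($names);
```

Every name from the function body should appear in the result, and none of the superglobals should.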

Notably, this list does not include the superglobals, such as $_GET, $_POST, and $GLOBALS. If you run get_defined_vars() in global scope (i.e. outside any function), you will see that these are present in the symbol table there, which is also what the magic variable $GLOBALS points to. So, why are they not present in every scope, and how can we use them if they're not?

For this, we need to dig into the internals of the implementation, where these variables are referred to as "auto-globals" rather than "superglobals".

The reason is performance. A naive implementation of an "auto-global" would act as though every function began with the line global $_GET, $_POST, ...;. However, that would mean adding all of those variables to the symbol table every time a function ran, even if none of them were used.

Instead, these variables are special-cased in the compiler, at the point where it converts your PHP code into the internal "opcodes" that the VM executes.

Using a source code browser, we can see how this works.

The key function is zend_is_auto_global in zend_compile.c (taken from current master, effectively PHP 7.2):

zend_bool zend_is_auto_global(zend_string *name) /* {{{ */
{
    zend_auto_global *auto_global;

    if ((auto_global = zend_hash_find_ptr(CG(auto_globals), name)) != NULL) {
        if (auto_global->armed) {
            auto_global->armed = auto_global->auto_global_callback(auto_global->name);
        }
        return 1;
    }
    return 0;
}

Here, name is the name of a variable, and CG means "compiler globals", so the main job of this function is to say "if the variable name given is in a compiler-global hash called auto_globals, return 1". The additional call to auto_global_callback allows the variable to be "lazy loaded", and only populated when it is first referenced.

The main usage of that function appears to be this conditional, in zend_compile_simple_var_no_cv:

if (name_node.op_type == IS_CONST && 
    zend_is_auto_global(Z_STR(name_node.u.constant))) {

    opline->extended_value = ZEND_FETCH_GLOBAL;
} else {
    opline->extended_value = ZEND_FETCH_LOCAL;
}

In other words, if the variable name you referenced is in the list of superglobals, the compiler switches the opcode into a different mode, so that when it is executed, it looks up the variable globally rather than locally.
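The effect is observable from PHP code. The following is a minimal sketch (superglobalDemo is a hypothetical name) showing that $_GET is absent from the function's symbol table yet can still be read inside the function:

```php
<?php
function superglobalDemo(): array
{
    // Snapshot of the local symbol table; nothing has been declared yet.
    $vars = get_defined_vars();

    return [
        // Is $_GET listed in the local symbol table?
        'listed' => array_key_exists('_GET', $vars),
        // Can $_GET still be read here? The lookup goes to the global table.
        'usable' => is_array($_GET),
    ];
}

$result = superglobalDemo();
var_dump($result);
```

Here 'listed' should come out false and 'usable' true, which matches the compiler redirecting the lookup rather than importing the variable into the function's scope.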

IMSoP

get_defined_vars gets all variables defined in the scope that it's called in. Your debugAllVars function introduces a new scope, so get_defined_vars will at most give you all variables within debugAllVars. It cannot give you variables from the caller's scope.
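One possible workaround, sketched here under the assumption that changing the helper's signature is acceptable, is to call get_defined_vars() in the caller and pass the result in. The array parameter and the processOrder caller are hypothetical additions for illustration:

```php
<?php
// Hypothetical variant of debugAllVars() that receives the variables
// instead of trying to discover them itself.
function debugAllVars(array $vars): void
{
    print_r($vars);
}

// Hypothetical caller with a couple of local variables.
function processOrder(): array
{
    $orderId = 17;
    $status = 'pending';

    // get_defined_vars() is evaluated here, in the caller's scope.
    // $vars itself is not yet defined at this point, so it is not included.
    $vars = get_defined_vars();
    debugAllVars($vars);

    return $vars;
}

$captured = processOrder();
```

In everyday use the call can be written directly as debugAllVars(get_defined_vars()); the intermediate $vars is only there so the captured array can be inspected afterwards.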

Also see Reference: What is variable scope, which variables are accessible from where and what are "undefined variable" errors?.

deceze