3

How does PHP array_keys do the search for value?

Example:

$array2 = array("xyz", "xyz", "abc", "abc", "xyz", "xyz", "text", "abc", "text");

print_r(array_keys($array2,"abc"));

Since they are key,value pairs. I am guessing PHP to do a search based on hash rather than iterate through each of the key-pairs in the array.

Any 'clarity thoughts' on this?

Question inspired by this question: How to get the keys of empty elements in an array if the corresponding element in a similar-sized array is a number (without iteration)?

Community
  • 1
  • 1
ThinkingMonkey
  • 12,539
  • 13
  • 57
  • 81
  • Why should PHP do a search based on hash, when all hashes need to be returned? That's the point I didn't understood with your question. – hakre Dec 28 '11 at 17:55
  • I was looking at the array_keys when a `$search_value` was specified. – ThinkingMonkey Dec 28 '11 at 17:57
  • Sure, but as the function is documented (and from what is logically), it first needs to get all keys, the search is performed optionally. So only after you have all keys you can search for a value. So how could a search for hash be helpful here? - The hash has been already obtained. – hakre Dec 28 '11 at 17:59
  • I did not know that. And I do not have clarity on the symbol table as well. So my question may have sprung up due to my lack of knowledge & confusion. – ThinkingMonkey Dec 28 '11 at 18:01

3 Answers3

5

In the php source, they iterate through each key and value, one by one. https://github.com/php/php-src/blob/master/ext/standard/array.c#L2439

/* {{{ proto array array_keys(array input [, mixed search_value[, bool strict]])
   Return just the keys from the input array, optionally only for the specified search_value */
PHP_FUNCTION(array_keys)
{
    zval *input,                /* Input array */
         *search_value = NULL,  /* Value to search for */
         **entry,               /* An entry in the input array */
           res,                 /* Result of comparison */
          *new_val;             /* New value */
    int    add_key;             /* Flag to indicate whether a key should be added */
    char  *string_key;          /* String key */
    uint   string_key_len;
    ulong  num_key;             /* Numeric key */
    zend_bool strict = 0;       /* do strict comparison */
    HashPosition pos;
    int (*is_equal_func)(zval *, zval *, zval * TSRMLS_DC) = is_equal_function;

    if (zend_parse_parameters(ZEND_NUM_ARGS() TSRMLS_CC, "a|zb", &input, &search_value, &strict) == FAILURE) {
        return;
    }

    if (strict) {
        is_equal_func = is_identical_function;
    }

    /* Initialize return array */
    if (search_value != NULL) {
        array_init(return_value);
    } else {
        array_init_size(return_value, zend_hash_num_elements(Z_ARRVAL_P(input)));
    }
    add_key = 1;

    /* Go through input array and add keys to the return array */
    zend_hash_internal_pointer_reset_ex(Z_ARRVAL_P(input), &pos);
    while (zend_hash_get_current_data_ex(Z_ARRVAL_P(input), (void **)&entry, &pos) == SUCCESS) {
        if (search_value != NULL) {
            is_equal_func(&res, search_value, *entry TSRMLS_CC);
            add_key = zval_is_true(&res);
        }

        if (add_key) {
            MAKE_STD_ZVAL(new_val);

            switch (zend_hash_get_current_key_ex(Z_ARRVAL_P(input), &string_key, &string_key_len, &num_key, 1, &pos)) {
                case HASH_KEY_IS_STRING:
                    ZVAL_STRINGL(new_val, string_key, string_key_len - 1, 0);
                    zend_hash_next_index_insert(Z_ARRVAL_P(return_value), &new_val, sizeof(zval *), NULL);
                    break;

                case HASH_KEY_IS_LONG:
                    Z_TYPE_P(new_val) = IS_LONG;
                    Z_LVAL_P(new_val) = num_key;
                    zend_hash_next_index_insert(Z_ARRVAL_P(return_value), &new_val, sizeof(zval *), NULL);
                    break;
            }
        }

        zend_hash_move_forward_ex(Z_ARRVAL_P(input), &pos);
    }
}
/* }}} */
goat
  • 31,486
  • 7
  • 73
  • 96
  • oh, okay. So you mean to say it has to happen in O(n)? and I am guessing this is because each value created has to have a zval of its own? – ThinkingMonkey Dec 28 '11 at 17:40
  • 1
    there is no better way to illustrate exactly how PHP does it :D – maček Dec 28 '11 at 17:41
  • 1
    ya its O(n). The reason though is more so because of the data structure they use internally, they don't maintain a list of which values map to which keys. So, you simply need to iterate all values. – goat Dec 28 '11 at 17:43
  • I thought so. & No, optimization for duplicate values in an array. – ThinkingMonkey Dec 28 '11 at 17:46
  • In case you're interested there's a description of the data structure they use [here](http://stackoverflow.com/questions/2350361/how-is-the-php-array-implemented-on-the-c-level) – goat Dec 28 '11 at 17:51
1

Please see the array_keys definition, take note also for the comments which explain it pretty well:

Return just the keys from the input array, optionally only for the specified search_value

and later on:

Go through input array and add keys to the return array

The definition as follows for PHP 5.3 on ext/standard/array.c:

   2427 /* {{{ proto array array_keys(array input [, mixed search_value[, bool strict]])
   2428    Return just the keys from the input array, optionally only for the specified search_value */
   2429 PHP_FUNCTION(array_keys)
   2430 {
   2431     zval *input,                /* Input array */
   2432          *search_value = NULL,  /* Value to search for */
   2433          **entry,               /* An entry in the input array */
   2434            res,                 /* Result of comparison */
   2435           *new_val;             /* New value */
   2436     int    add_key;             /* Flag to indicate whether a key should be added */
   2437     char  *string_key;          /* String key */
   2438     uint   string_key_len;
   2439     ulong  num_key;             /* Numeric key */
   2440     zend_bool strict = 0;       /* do strict comparison */
   2441     HashPosition pos;
   2442     int (*is_equal_func)(zval *, zval *, zval * TSRMLS_DC) = is_equal_function;
   2443 
   2444     if (zend_parse_parameters(ZEND_NUM_ARGS() TSRMLS_CC, "a|zb", &input, &search_value, &strict) == FAILURE) {
   2445         return;
   2446     }
   2447 
   2448     if (strict) {
   2449         is_equal_func = is_identical_function;
   2450     }
   2451 
   2452     /* Initialize return array */
   2453     if (search_value != NULL) {
   2454         array_init(return_value);
   2455     } else {
   2456         array_init_size(return_value, zend_hash_num_elements(Z_ARRVAL_P(input)));
   2457     }
   2458     add_key = 1;
   2459 
   2460     /* Go through input array and add keys to the return array */
   2461     zend_hash_internal_pointer_reset_ex(Z_ARRVAL_P(input), &pos);
   2462     while (zend_hash_get_current_data_ex(Z_ARRVAL_P(input), (void **)&entry, &pos) == SUCCESS) {
   2463         if (search_value != NULL) {
   2464             is_equal_func(&res, search_value, *entry TSRMLS_CC);
   2465             add_key = zval_is_true(&res);
   2466         }
   2467 
   2468         if (add_key) {
   2469             MAKE_STD_ZVAL(new_val);
   2470 
   2471             switch (zend_hash_get_current_key_ex(Z_ARRVAL_P(input), &string_key, &string_key_len, &num_key, 1, &pos)) {
   2472                 case HASH_KEY_IS_STRING:
   2473                     ZVAL_STRINGL(new_val, string_key, string_key_len - 1, 0);
   2474                     zend_hash_next_index_insert(Z_ARRVAL_P(return_value), &new_val, sizeof(zval *), NULL);
   2475                     break;
   2476 
   2477                 case HASH_KEY_IS_LONG:
   2478                     Z_TYPE_P(new_val) = IS_LONG;
   2479                     Z_LVAL_P(new_val) = num_key;
   2480                     zend_hash_next_index_insert(Z_ARRVAL_P(return_value), &new_val, sizeof(zval *), NULL);
   2481                     break;
   2482             }
   2483         }
   2484 
   2485         zend_hash_move_forward_ex(Z_ARRVAL_P(input), &pos);
   2486     }
   2487 }
   2488 /* }}} */

The search that is performed is also explained on the PHP manual page for array_keys­Docs:

If the optional search_value is specified, then only the keys for that value are returned. Otherwise, all the keys from the input are returned.

See also the $strict parameter to influence how values are compared:

Determines if strict comparison (===) should be used during the search.

The two types of comparison == (Equal) and === (Identical) are documented as well.

hakre
  • 193,403
  • 52
  • 435
  • 836
0

I suspect this to be not so easy:

  • If you do NOT give the last parameter "strict" as true, PHP might need to walk the array, along the way converting values to a compareable type.
  • If you DO give "strict" as true, it might also be necessary to iterate the array to exclude wrong types
Eugen Rieck
  • 64,175
  • 10
  • 70
  • 92