13

I am unsure which of these two approaches to use:

foreach(){
    // .....

    if(!in_array($view, $this->_views[$condition]))
        array_push($this->_views[$condition], $view);

    // ....
}

OR

foreach(){
    // .....

    array_push($this->_views[$condition], $view);

    // ....
}

$this->_views[$condition] = array_unique($this->_views[$condition]);

UPDATE

The goal is to get an array of unique values. This can be done either by checking with in_array whether the value already exists before each insert, or by adding every value and calling array_unique once at the end. Is there any major difference between these two approaches?

hakre
user1692333
  • Read the documentation, `array_unique` removes duplicate values within a given array... `in_array` provides a search into the array values and returns a true/false if found/not found – Daryl Gill Apr 10 '13 at 21:56
  • @DarylGill I know what these functions do, but I want to know which of the provided examples is better – user1692333 Apr 10 '13 at 21:57
  • You have not provided enough information about why you are stuck choosing between these two functions, in what context they are being used, etc. – Daryl Gill Apr 10 '13 at 21:58
  • @DarylGill the goal is to get an array of unique values. I can do this by checking every time whether the value already exists with `in_array`, or by adding all values each time and using `array_unique` afterwards – user1692333 Apr 10 '13 at 22:01
  • Back of the envelope tells me array_unique is better. It would be `O(n) + O(n log(n))` rather than `O(n^2)` for checking `in_array` each time – Matt Dodge Apr 10 '13 at 22:01

3 Answers

15

I think the second approach would be more efficient. In fact, array_unique sorts the array and then scans it.

Sorting takes O(N log N) steps, and the scan takes another N steps.

The first approach takes O(N^2) steps, because for each element in_array scans up to N previously added elements. On big arrays, the difference is very large.
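To make the comparison concrete, here is a minimal sketch of both approaches written as standalone functions. The function names and the sample `$views` data are made up for illustration and are not from the question; the second function also wraps the result in array_values() so that both return a plain 0-based list.

```php
<?php
// Approach 1: check with in_array() before every append.
// Each in_array() call scans the values collected so far.
function unique_with_in_array(array $values): array
{
    $result = [];
    foreach ($values as $value) {
        if (!in_array($value, $result)) {
            $result[] = $value;
        }
    }
    return $result;
}

// Approach 2: append everything, then deduplicate once at the end.
function unique_with_array_unique(array $values): array
{
    $result = [];
    foreach ($values as $value) {
        $result[] = $value;
    }
    // array_unique() keeps the original keys, so reindex the result.
    return array_values(array_unique($result));
}

// Hypothetical sample data.
$views = ['home', 'list', 'home', 'detail', 'list'];

$a = unique_with_in_array($views);
$b = unique_with_array_unique($views);
```

For string values like these, both functions should return the same list with the first occurrence of each value kept in order. Which one is faster in practice depends on the array size and the PHP version, so it is worth measuring on your own data.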

p91paul
  • +1 for second approach. Here is a good script (snippet) with comparison in milliseconds: https://gist.github.com/Ocramius/7453564 – TroodoN-Mike May 10 '14 at 12:29
4

Honestly, if you're using a small dataset it does not matter which one you use. If your dataset is in the tens of thousands, you'll most definitely want to use a hash map for this sort of thing.

This assumes the views are strings (or another valid array key type), which appears to be the case. Using the values as keys is typically O(n) overall and possibly the fastest way to track unique values.

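foreach ($views as $view)
{
    // Array keys are hashed, so this lookup is O(1) on average.
    // isset() also copes with $unique_views[$condition] not existing yet.
    if (!isset($unique_views[$condition][$view]))
    {
        $unique_views[$condition][$view] = true;
    }
}

// The unique views are then the keys: array_keys($unique_views[$condition])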
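As a self-contained illustration of the same idea, the sketch below wraps the key-based approach in a helper. The name `unique_by_key` and the sample values are hypothetical. It also shows one caveat worth knowing: PHP casts numeric string keys to integers, so `'1'` and `1` end up as the same key.

```php
<?php
// Hypothetical helper: deduplicate values by using them as array keys.
// Only works for values that are valid array keys (strings or integers).
function unique_by_key(array $values): array
{
    $seen = [];
    foreach ($values as $value) {
        // Assigning an existing key again is a no-op for uniqueness,
        // so no explicit existence check is needed here.
        $seen[$value] = true;
    }
    return array_keys($seen);
}

$unique = unique_by_key(['home', 'list', 'home', 'detail']);

// Caveat: numeric strings become integer keys, so '1' and 1 collapse
// into a single integer key.
$numeric = unique_by_key(['1', 1, '2']);
```

If the original types matter (for example, keeping `'1'` as a string), this approach would need a separate map from key to original value.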
Anther
0

TL;DR: foreach combined with if (!in_array()) is better.

Truthfully, you should not really worry about which performs better; in most cases the difference is so small that it's negligible (unless you're really doing some big data stuff). I would suggest going with whatever seems more readable.

If you're interested, check out this script I wrote. It runs each case 100,000 times, and both take between 50 and 200 ms.

https://3v4l.org/lkTCF

Note that array_unique() keeps the original keys, so to get a consecutively indexed array we also have to wrap the result with array_values().
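A short sketch of the key-preservation behaviour mentioned above, using made-up sample values: after array_unique() the keys have gaps where duplicates were removed, and array_values() renumbers them from 0.

```php
<?php
// Hypothetical sample data with one duplicate at index 2.
$views = ['home', 'list', 'home', 'detail'];

// array_unique() keeps the first occurrence and its original key,
// so the remaining keys are 0, 1 and 3.
$unique = array_unique($views);

// array_values() discards the keys and reindexes from 0.
$reindexed = array_values($unique);
```

Skipping array_values() is fine if the array is only iterated with foreach, but code that relies on consecutive indexes (for example a `for` loop over `count()`) would hit the missing key.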

In case the link ever dies:

<?php
$loops = 100000;

$start = microtime(true);
for ($l = 0; $l < $loops; $l++) {
    $x = [1,2,3,4,6,7,8,9];
    for ($i = 0; $i <= 10; $i++) {
        if (!in_array($i, $x)) {
            $x[] = $i;
        }
    }
}
$duration = microtime(true) - $start;
echo "in_array took $duration<br>".PHP_EOL;

$start = microtime(true);
for ($l = 0; $l < $loops; $l++) {
    $x = [1,2,3,4,6,7,8,9];
    $x = array_values(array_unique(array_merge($x, [0,1,2,3,4,5,6,7,8,9,10])));
}
$duration = microtime(true) - $start;
echo "array_unique took $duration<br>".PHP_EOL;


Ken