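When I print the $online_performers variable, id 2 appears more than once, and I want the result to contain no duplicated ids. Do I need to convert the objects to a standard array first, or is that possible without converting them? I want to remove all duplicates, so every item whose id occurs more than once should be removed. Please check my new code for this.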

    Array
        (
            [0] => stdClass Object
            (
                [id] => 1
                [username] => Sample1
            )
            [1] => stdClass Object
            (
                [id] => 2
                [username] => Sample1
            )
            [2] => stdClass Object
            (
                [id] => 2
                [username] => Sample1
            )
            [3] => stdClass Object
            (
                [id] => 4
                [username] => Sample4
            )
        )



to
    Array
        (
            [0] => stdClass Object
            (
                [id] => 1
                [username] => Sample1
            )
            [1] => stdClass Object
            (
                [id] => 4
                [username] => Sample4
            )
        )
Rakhi

3 Answers

2

PHP has a function called array_filter() for that purpose:

$filtered = array_filter($array, function($item) {
    static $counts = array();
    if(isset($counts[$item->id])) {
        return false;
    }

    $counts[$item->id] = true;
    return true;
});

Note the use of the static keyword. Inside a function, a static variable is initialized only once, when the function is called for the first time, and it keeps its value between calls. This lets the lookup table $counts persist across the repeated calls that array_filter() makes to the callback.
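To make the behaviour of the static lookup table concrete, here is a minimal sketch that applies the same callback to data modelled on the question's print_r() output. The sample data and the variable names $seen and $ids are assumptions made for illustration. It also re-indexes the result, because array_filter() preserves the original keys.

```php
<?php
// Sample data modelled on the question's output (an assumption for illustration).
$online_performers = [
    (object)['id' => 1, 'username' => 'Sample1'],
    (object)['id' => 2, 'username' => 'Sample1'],
    (object)['id' => 2, 'username' => 'Sample1'],
    (object)['id' => 4, 'username' => 'Sample4'],
];

$filtered = array_filter($online_performers, function ($item) {
    static $seen = [];              // keeps its contents between calls of this closure
    if (isset($seen[$item->id])) {
        return false;               // this id was already kept once: drop the item
    }
    $seen[$item->id] = true;
    return true;
});

// array_filter() keeps the original keys (0, 1 and 3 here), so re-index them.
$filtered = array_values($filtered);

// Collect the ids that survived, for inspection.
$ids = array_map(function ($item) {
    return $item->id;
}, $filtered);
```

This keeps the first item for each id, so id 2 still appears once in the result. It does not remove every occurrence of a duplicated id.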


In the comments you said that you are also looking for a way to remove all items with id X if X appears more than once. You could use the following algorithm, which uses a lookup table $ids to detect elements whose id occurs more than once and removes all of them:

$array = $online_performers; // your array of stdClass objects; do not wrap it in another array()

$ids = array();
$result = array();

foreach($array as $item) {
    if(!isset($ids[$item->id])) {
        $result[$item->id]= $item;
        $ids[$item->id] = true;
    } else {
        if(isset($result[$item->id])) {
            unset($result[$item->id]);
        }
    }
}

$result = array_values($result);
var_dump($result);
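The same result can also be reached in two passes: count how often each id occurs, then keep only the items whose id was counted exactly once. The following is a sketch of that idea rather than a replacement for the loop above. The sample data and the names $counts and $resultIds are assumptions, and the ?? operator requires PHP 7 or later.

```php
<?php
// Sample data modelled on the question's output (an assumption for illustration).
$online_performers = [
    (object)['id' => 1, 'username' => 'Sample1'],
    (object)['id' => 2, 'username' => 'Sample1'],
    (object)['id' => 2, 'username' => 'Sample1'],
    (object)['id' => 4, 'username' => 'Sample4'],
];

// Pass 1: count how many times each id occurs.
$counts = [];
foreach ($online_performers as $item) {
    $counts[$item->id] = ($counts[$item->id] ?? 0) + 1;
}

// Pass 2: keep only the items whose id occurred exactly once.
$result = [];
foreach ($online_performers as $item) {
    if ($counts[$item->id] === 1) {
        $result[] = $item;
    }
}

// Collect the surviving ids, for inspection.
$resultIds = array_map(function ($item) {
    return $item->id;
}, $result);
```

Because the second pass appends with $result[], the result is already numerically indexed and keeps the original order, so no array_values() call is needed.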
hek2mgl
  • Hi, is it possible to remove the duplicates entirely? – Rakhi Aug 05 '14 at 13:22
  • You mean removing every item that occurs multiple times? – hek2mgl Aug 05 '14 at 13:27
  • @vishal I don't know a *fancy* way. You would need to iterate over the array, count occurrences and remove those which are duplicates in a second loop afterwards. – hek2mgl Aug 05 '14 at 13:34
  • Hi, your first algorithm works fine, but the second one does not work properly. I set my array as `$array = array($online_performers);` but it does not work. Your first algorithm removes duplicates but keeps one copy: for example, if id=2 occurs three times in the array, it removes two and keeps one. Instead, I want to delete all items that are duplicated. Thanks. – Rakhi Aug 06 '14 at 05:32
  • @vishal Oh, did I miss something? ... Let me check this ... – hek2mgl Aug 06 '14 at 08:10
  • Thanks if you can check it. Also, is it possible to remove all occurrences? Thanks very much – Rakhi Aug 06 '14 at 08:13
  • @vishal Yes, there was a bug. Now it should work as expected. – hek2mgl Aug 06 '14 at 08:24
  • `$array = array($online_performers); $ids = array(); $result = array(); foreach($array as $item) { if(!in_array($item->id, $ids, true)) { $result[$item->id]= $item; $ids[]=$item->id; } else { if(isset($result[$item->id])) { unset($result[$item->id]); } } } $result = array_values($result); echo " "; print_r($result);` – Rakhi Aug 06 '14 at 11:30
  • This code is not working for me as expected. Please help @hek2mgl – Rakhi Aug 06 '14 at 11:31
  • Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/58770/discussion-between-vishal-and-hek2mgl). – Rakhi Aug 06 '14 at 11:31
  • @vishal I've enhanced the second example, this improves performance. – hek2mgl Aug 06 '14 at 22:20
1

Let's say we have:

$array = [
   // the items at indexes 1, 2 and 3 share the same id
   (object)['id'=>1, 'username'=>'foo'],
   (object)['id'=>2, 'username'=>'bar'],
   (object)['id'=>2, 'username'=>'baz'],
   (object)['id'=>2, 'username'=>'bar']
];

What counts as a duplicate depends on what you mean. For instance, if two items with the same id are treated as duplicates, then:

$field = 'id';

$result = array_values(
   array_reduce($array, function($c, $x) use ($field)
   {
      $c[$x->$field] = $x;
      return $c;
   }, [])
);
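Note that the reduce callback above overwrites the entry for an id each time it sees that id again, so the last item per id is the one that survives. If the first item should be kept instead, an isset() check inside the callback is enough. The sketch below compares both variants. The sample data and the names $lastWins and $firstWins are assumptions made for illustration.

```php
<?php
// Two items share id 2 but differ in username, so the two variants differ visibly.
$array = [
    (object)['id' => 1, 'username' => 'foo'],
    (object)['id' => 2, 'username' => 'bar'],
    (object)['id' => 2, 'username' => 'baz'],
];

$field = 'id';

// Variant 1: every later item overwrites the earlier one with the same id.
$lastWins = array_values(
    array_reduce($array, function ($c, $x) use ($field) {
        $c[$x->$field] = $x;
        return $c;
    }, [])
);

// Variant 2: an item is stored only if its id has not been seen yet.
$firstWins = array_values(
    array_reduce($array, function ($c, $x) use ($field) {
        if (!isset($c[$x->$field])) {
            $c[$x->$field] = $x;
        }
        return $c;
    }, [])
);
```

Both variants return two items. They differ only in which of the id 2 items is kept: 'baz' in the first variant, 'bar' in the second.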

However, if all fields must match for two items to count as duplicates, then it's a different thing:

$array = [
   // the items at indexes 1 and 3 are the same; 2 and 3 are not:
   (object)['id'=>1, 'username'=>'foo'],
   (object)['id'=>2, 'username'=>'bar'],
   (object)['id'=>2, 'username'=>'baz'],
   (object)['id'=>2, 'username'=>'bar']
];

You'll need some way to identify each item by its values. The easiest way is to use serialize():

$result = array_values(
   array_reduce($array, function($c, $x)
   {
      $c[serialize($x)] = $x;
      return $c;
   }, [])
);

But that may be slow, since you serialize the entire object structure. You will not see a performance impact with small objects, but for complex structures and a large number of them it sounds bad.

Also, if you don't care about the keys in the resulting array, you may omit the array_values() call, since its only purpose is to make the keys numeric and consecutive.
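One caveat with serialize() as the identity key is that it is sensitive to the order in which properties were declared: two objects holding the same values in a different order serialize to different strings. The sketch below shows this and one possible way around it. The helper name normalizedKey() is hypothetical, and sorting the properties by name is an assumption about what "equal" should mean here.

```php
<?php
// Same property values, declared in a different order.
$a = (object)['id' => 2, 'username' => 'bar'];
$b = (object)['username' => 'bar', 'id' => 2];

// The serialized strings differ because serialize() keeps property order.
$sameBySerialize = serialize($a) === serialize($b);

// Hypothetical helper: build an order-independent key by sorting the
// properties by name before serializing them.
function normalizedKey($obj)
{
    $props = (array)$obj;   // property name => value
    ksort($props);          // sort by property name
    return serialize($props);
}

$sameByNormalizedKey = normalizedKey($a) === normalizedKey($b);

// Deduplicate using the normalized key instead of serialize($x).
$result = array_values(
    array_reduce([$a, $b], function ($c, $x) {
        $c[normalizedKey($x)] = $x;
        return $c;
    }, [])
);
```

Whether property order should matter is a design decision. If it should, plain serialize() already does the job.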

Alma Do
  • That's deliciously elegant, but it's checking username and he wants to check the [id] field. – RiggsFolly Aug 05 '14 at 12:06
  • @RiggsFolly I've made an option for that too (see first explanation) – Alma Do Aug 05 '14 at 12:07
  • Ah thanks, I missed that. Now I just have to work out what it all means. – RiggsFolly Aug 05 '14 at 12:10
  • Hi Alma, thanks. In my case I have merged two arrays, so I don't want to keep any other entry with the same id. For example, from the array you made, keep only 1 and 2 and remove the others with id 2: (object)['id'=>1, 'username'=>'foo'], (object)['id'=>2, 'username'=>'bar'] – Rakhi Aug 05 '14 at 12:24
  • @vishal then it's my first sample (where I explain how to filter duplicates by the `id` column) – Alma Do Aug 05 '14 at 12:24
  • I can't see where this should be elegant. The `serialize()` is overhead which brings no profit. `array_filter()` is the way to go here. – hek2mgl Aug 05 '14 at 12:27
  • Unused in the second case (yep, my bad, copy+paste from the first callback). But about "elegant" - I didn't say that. It will work in O(n) time like filtering. The difference is that filtering happens by hash automatically. And for me, `array_filter()` has its downsides too (I mean the `static` var). So, well, if you think that `array_filter()` is the way to go, fine, but please add "IMO" to the sentence – Alma Do Aug 05 '14 at 12:29
  • You say difference, I say disadvantage. IMO would indicate that it is something which depends on an opinion. This is not the case here. You are clearly wrong – hek2mgl Aug 05 '14 at 12:30
  • I can say something about using static vars - they do reduce readability. So _IMO_ static vars are a disadvantage. You cannot claim that "I'm wrong", it's just _wrong_ to say so. Because what is an advantage and what isn't depends on opinion, especially when it comes to readability. And - yes, I don't see filtering via hash as a disadvantage. If you do - again, fine. But don't try to make _your opinion_ the only right thing. – Alma Do Aug 05 '14 at 12:42
  • Seriously, you `serialize` every object in the loop, then compare the serialized strings and after all that reorganize the array - in contrast to me just checking `if(isset($count[$item->id]))` .. You are really still discussing? Why don't you just say: "Ok, thanks"? – hek2mgl Aug 05 '14 at 13:07
  • Did you read the entire post? Or just "wait, there's serialize there, it's bad"? What I'm suggesting with serialize is _only for the case when whole equality is needed_. For the case of removing duplicates by a specific field it's just plain key usage, and I've also stated when we need the `array_values()` call – Alma Do Aug 05 '14 at 13:13
  • Maybe you should have omitted the `serialize` part. Even if you compare all properties, it would be better to implement a function like `MyObject::compare(MyObject $other);`, better because yours will not work: http://3v4l.org/CHQPm – hek2mgl Aug 05 '14 at 13:25
  • That "not work" isn't specified in the question. And - yes, I assume that not only the content of the properties matters, but also their order. That is, in my case, the compare function. Is it good or not? Debatable. The point about property order is on one side, but the order itself is on the other side. I've made an example, I've explained it. I've pointed out that it may be slow (and why). I can't imagine why you still insist on a "bad point" here. – Alma Do Aug 05 '14 at 13:28
  • Ok. Just one other thing: your code (even the first example) will remove the original, not the duplicate. This is likely not the expected result. (The other answer does the same; I admit that I failed to criticize that before, even below the other answer.) However, I think the question is of such bad quality that it does not make sense to debate this anymore. AFK – hek2mgl Aug 05 '14 at 17:28
  • And... what? If they are duplicates, then by definition it doesn't matter which of them to keep – Alma Do Aug 05 '14 at 17:40
  • I expected this answer, but look at your first example: you are just checking for id equality, not for name equality. So it does indeed matter. – hek2mgl Aug 05 '14 at 20:07
  • Again, and... what? If only `id` matters and only id determines equality, all other properties are irrelevant in terms of comparison. Thus, once the id matches, we're free to select any item (because they are already equal). In the second sample, with the so-hated `serialize()`, it really matters, as does even declaration order - but that's intended. However, fine: if one needs to keep exactly the _first_ entry per `id`, just add an `if(!isset($c[$x->id]))` check; the whole concept won't change - but, again, I see no sense in that – Alma Do Aug 05 '14 at 20:10
  • Or you could just `array_reverse()` before and after calling `array_reduce()`. This would make it more flexible, you could choose the order. I don't expect `array_reverse()` being that expensive. Also I never say that I "hated" `serialize()`. I would like to see a benchmark, I could imagine situations where comparing the serialized object might be even faster than comparing every single property manually, especially if there are a lot. However, there should remain the question "why there are objects having *exactly* the same data, even id?". I can't find a valid real world scenario ... – hek2mgl Aug 05 '14 at 20:27
  • I doubt if it's a flaw exactly (because it's what nature of equality is - once we treat elements as equal, they all have same "right" to be claimed as duplicates). array_reverse() is also the way, but it will cause unnecessary memory overhead - so check in-place for existence of key will save that – Alma Do Aug 05 '14 at 20:31
1

If you don't care about changing your keys you could do this with a simple loop:

$aUniq = array ();
foreach($array as $obj) {
    $aUniq[$obj->id] = $obj;
}

print_r($aUniq);
hlscalon