4

If I have an array [1 2 3 4 3 5 6 7 8 7], I'd like to find the list of non-unique entries: [3 7]. I can't find a simple way to do it. Any ideas?

Update: I'd like a general solution that also works with a cell array of strings.

texnic
  • possible duplicate of http://stackoverflow.com/questions/5385651/determining-the-number-of-occurrences-of-each-unique-element-in-a-vector – gregswiss Sep 23 '15 at 11:12
  • @gregswiss: I need a list of duplicated elements rather than how often they are encountered. Besides, the solutions in the linked question are not applicable to non-numeric arrays. – texnic Sep 23 '15 at 12:25

5 Answers

5

If A has length n, you can find the index in A of the first occurrence of each distinct entry and remove those entries from A. What is left are the repeats, which a final call to unique collapses:

A = [1 2 3 4 3 5 6 7 8 7];
n=length(A);
[~,IA,~] = unique(A);
out = unique(A(setdiff((1:n),IA)))
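
To see why the outer unique is needed, here is a minimal sketch of the same steps on an input where one entry occurs three times. The modified input and the variable name rest are illustrative and not part of the answer above.

```matlab
A = [1 2 3 4 3 5 3 7 8 7];           % 3 occurs three times, 7 twice

[~, IA] = unique(A);                 % one index per distinct value
rest = A(setdiff(1:numel(A), IA));   % everything except those occurrences
out  = unique(rest);                 % list each repeated value once

disp(rest)                           % 3 3 7
disp(out)                            % 3 7
```

Without the final unique, a value that occurs k times would be listed k-1 times.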
Steve
2

One approach with unique and histc -

[unqA,~,id] = unique(A);
out = unqA(histc(id,1:max(id))>1)

Or use accumarray in place of histc -

out = unqA(accumarray(id(:),1)>1)

Or use bsxfun -

out = unqA(sum(bsxfun(@eq,id(:),1:max(id(:)).'))>1)
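
Recent MATLAB documentation lists histc as not recommended. A hedged sketch of the same counting step with histcounts (available from R2014b, as far as I know) might look like the following. The name counts is illustrative, and the extra +1 edge is an assumption about how to give the largest id a bin of its own.

```matlab
A = {'apple','banana','apple','mango','ball','cat','banana','apple'};

[unqA, ~, id] = unique(A);               % id maps each entry to its slot in unqA
counts = histcounts(id, 1:max(id)+1);    % one bin per distinct value
out = unqA(counts > 1);                  % keep values that occur more than once

disp(out)
```

Because id holds integers only, the bins [1,2), [2,3), ... each count exactly one distinct value, so the result should match the histc and accumarray versions.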

Sample runs -

1) Numeric arrays case -

>> A
A =
     6     3     7     7     4     3     8     5     2     3     1
>> [unqA,~,id] = unique(A);
>> unqA(histc(id,1:max(id))>1)
ans =
     3     7
>> unqA(accumarray(id(:),1)>1)
ans =
     3     7
>> unqA(sum(bsxfun(@eq,id(:),1:max(id(:)).'))>1)
ans =
     3     7

2) Cell arrays case -

>> A = {'apple','banana','apple','mango','ball','cat','banana','apple'};

>> [unqA,~,id] = unique(A);
>> unqA(histc(id,1:max(id))>1)
ans = 
    'apple'    'banana'
>> unqA(accumarray(id(:),1)>1)
ans = 
    'apple'    'banana'
>> unqA(sum(bsxfun(@eq,id(:),1:max(id(:)).'))>1)
ans = 
    'apple'    'banana'
Divakar
2
x=[1 2 3 4 3 5 6 7 8 7];
y=x;
[~,ind,~]=unique(y);
y(ind)=[];

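y now holds the non-unique entries. A value that occurs more than twice appears more than once in y, so use unique(y) if each value should be listed only once.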
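
As a sketch of how this works: unique returns one index per distinct value, deleting those positions leaves only the extra copies, and the same indexing applies to a cell array of strings. The sample data and the name dups are illustrative.

```matlab
x = {'a','b','a','c','a','b'};

[~, ind] = unique(x);   % one index per distinct string
y = x;
y(ind) = [];            % drop one occurrence of each; extras remain
dups = unique(y);       % collapse values that occurred more than twice

disp(dups)
```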

Mirza
  • Although this code may help to solve the problem, providing additional context regarding _why_ and/or _how_ it answers the question would significantly improve its long-term value. Please [edit] your answer to add some explanation. – oɔɯǝɹ Jul 20 '16 at 19:16
0

Since you asked for a more generic solution, here is one that should be easy to adapt to other data types. Unlike the others, it is also an O(n) solution; the drawback is that MATLAB loops are slow on large arrays.

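A = [1 2 3 4 3 5 6 7 8 7];
dupes = [];

% Map used as a set of the values seen so far
map = containers.Map('KeyType', class(A), 'ValueType', 'logical');
for i = 1:numel(A)
    if map.isKey(A(i))
        dupes = [dupes A(i)];   % seen before: record it as a duplicate
    else
        map(A(i)) = true;       % first occurrence: key on the value, not the index
    end
end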
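
One possible adaptation to a cell array of strings is sketched below, assuming char keys. The name seen and the trick of storing "already reported" as the map value are my own additions, not part of the answer; they keep a string that occurs three or more times from being listed twice.

```matlab
A = {'apple','banana','apple','mango','banana','apple'};

% Value is false after the first sighting, true once reported as a duplicate
seen  = containers.Map('KeyType', 'char', 'ValueType', 'logical');
dupes = {};

for i = 1:numel(A)
    key = A{i};
    if isKey(seen, key)
        if ~seen(key)
            dupes{end+1} = key;   % second sighting: report once
            seen(key) = true;
        end
    else
        seen(key) = false;        % first sighting
    end
end

disp(dupes)
```

Unlike the unique-based answers, the duplicates come out in order of their second appearance rather than sorted.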
gregswiss
-1

Another approach using sort and diff:

As = sort(A);
out = unique(As([false diff(As)==0]));
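
diff is not defined for cell arrays, so the snippet above is limited to numeric vectors. A rough sketch of the same sort-then-compare-neighbours idea for a cell array of strings, swapping in strcmp for diff, could be the following (isRepeat is an illustrative name, and A is assumed to be a row cell array):

```matlab
A  = {'apple','banana','apple','mango','banana','apple'};

As = sort(A);                                        % equal strings become adjacent
isRepeat = [false strcmp(As(1:end-1), As(2:end))];   % true where an entry equals its predecessor
out = unique(As(isRepeat));                          % list each repeated string once

disp(out)
```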
Bentoy13 (edited by Steve)
  • It seems more than two occurrences of a number would give duplicated entries in the output, unless I missed something. – Divakar Sep 23 '15 at 10:36
  • @Divakar No, you didn't miss anything; you're absolutely right. Thanks to Steve for editing my answer. – Bentoy13 Sep 23 '15 at 10:58