I am looking for a vectorized MATLAB function to solve the following problem:

I have a sorted multiset T = [1 1 1 1 2 2 2 3]

and a sorted submultiset V of T (length(V) is always smaller than length(T))

V = [1 1 1 2]

I need to find a logical vector

D = [1 1 1 0 1 0 0 0]

where length(D) = length(T) and T(D) = V

michal

1 Answer

For good performance, my idea would be to work with the histograms, which keeps the data size small:

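T = [1 1 1 1 2 2 2 3];
V = [1 1 1 2];
% Get list of all symbols
E = unique(T);
% Count how often each symbol occurs in T and in V
hT = hist(T,E);
hV = hist(V,E);
% Per symbol: length of the run of ones (taken by V), then of the run of zeros (left over)
rep = [hV; hT-hV];
% Next two lines are taken from this answer http://stackoverflow.com/a/28615814/2732801
% They expand rep into alternating runs of ones and zeros.
% R is a 0/1 double column vector; use logical(R) to index into T.
R = mod(cumsum(accumarray(cumsum([1; rep(:)]), 1)),2);
R = R(1:end-1);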

Instead of the last two lines, the MATLAB function repelem (available since R2015a) can be used:

R=mod(repelem(1:numel(rep),rep(:)),2);
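As a quick check of the repelem variant, here is a minimal sketch, assuming MATLAB R2015a or later. The variable D, the conversion with logical, and the assert are additions for illustration and are not part of the original answer. mod returns doubles, so the result has to be converted before it can be used as a mask.

```matlab
T = [1 1 1 1 2 2 2 3];
V = [1 1 1 2];
E = unique(T);             % distinct symbols: [1 2 3]
hT = hist(T, E);           % counts in T: [4 3 1]
hV = hist(V, E);           % counts in V: [3 1 0]
rep = [hV; hT - hV];       % per symbol: run of ones, then run of zeros
% Odd run indices map to 1, even run indices map to 0
D = logical(mod(repelem(1:numel(rep), rep(:).'), 2));
assert(isequal(T(D), V));  % D is [1 1 1 0 1 0 0 0]
```

Each symbol contributes two runs, so the odd-numbered runs are exactly the elements that belong to V.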
michal
Daniel
  • How exactly do I use the repelem function? – michal Feb 02 '16 at 12:10
  • @michal: My MATLAB version is too old to have this function, so I do not know. Maybe someone else can answer this. – Daniel Feb 02 '16 at 12:11
  • Is there a simple vectorized way to generalize this solution to a list of V, for example V = [1 1 1 2; 1 1 2 2; 1 2 2 2]? – michal Feb 02 '16 at 12:27
  • Why do you expect it to be faster? The only difference is whether you define the bins using edges or centers. – Daniel Feb 02 '16 at 13:26
  • Regarding your previous comment and a vectorized version for multiple V: for that data representation I have no better idea than looping. Where does V come from? Maybe you can switch to a representation like `V = [3 1; 2 2; 1 3]` (same meaning, just counting the numbers instead of repeating them). – Daniel Feb 02 '16 at 15:20
  • histc is a built-in function, whereas hist performs many additional steps in the hist.m file. – michal Feb 03 '16 at 13:43
  • The overhead should be irrelevant – Daniel Feb 04 '16 at 14:18
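
Following the suggestion in the comments to store counts instead of repeated symbols, a loop over several V could be sketched as below. The count matrix C and the output name D are assumptions for illustration and not part of the original answer. C is assumed to have one column per symbol in unique(T), which means a zero column for the symbol 3. repelem requires R2015a or later.

```matlab
T = [1 1 1 1 2 2 2 3];
E = unique(T);             % distinct symbols: [1 2 3]
hT = hist(T, E);           % counts in T: [4 3 1]
% Hypothetical count matrix: one row per V, one column per symbol in E.
% Rows stand for [1 1 1 2], [1 1 2 2] and [1 2 2 2].
C = [3 1 0; 2 2 0; 1 3 0];
D = false(size(C, 1), numel(T));   % one logical mask per row of C
for k = 1:size(C, 1)
    rep = [C(k, :); hT - C(k, :)]; % runs of ones and zeros per symbol
    D(k, :) = logical(mod(repelem(1:numel(rep), rep(:).'), 2));
end
% T(D(2, :)) gives [1 1 2 2]
```

The histogram of T is computed once and reused, so each iteration only handles a vector with two entries per distinct symbol.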