How do I return only the rows of a cell array 'A' that do not contain certain values (these values are given in an array 'B')?

    A = {'A1',  5,  'P01,P02,P03,P04';
         'A2',  7,  'P08,P09';
         'A3',  8,  'P07';
         'A4',  8,  'P10,P11'};

    B = { 'P07'; 'P10'; 'P11'};

I need to return only:

'A1'
'A2'

Thanks in advance for your help

TimeIsNear
  • Is it safe to assume that: codes always start with P, always have 2 digits, always are increasing, and never 'skip' numbers? If so and if required you could probably get the best performance by turning `B` into a string and using `strfind` on it. – Dennis Jaheruddin Jan 09 '14 at 14:22
  • Your edit completely changes the question! Sorry, I'm voting -1 – Luis Mendo Jan 09 '14 at 15:06
  • Sorry Luis, I will re-edit this question back to its original form and create a new one. – TimeIsNear Jan 09 '14 at 15:19
  • @TimeIsNear I think that's a better approach. +1 back – Luis Mendo Jan 09 '14 at 15:23
  • @Luis, thanks, I've created a new question here: http://stackoverflow.com/questions/21024280/matlab-return-only-the-rows-of-a-matrix-a-that-not-contain-some-values-of-ma?noredirect=1#comment31602989_21024280 – TimeIsNear Jan 09 '14 at 15:57

3 Answers

To remove rows of A which contain at least one of the strings in B

Fancy one-liner with two nested cellfuns and strfind at its core:

A(all(cell2mat(cellfun(@(b) cellfun(@isempty, strfind(A(:,end),b)).',B, 'uni', 0))),1)

Perhaps the logical indices computed as intermediate result are of interest:

ind = cell2mat(cellfun(@(b) cellfun(@isempty, strfind(A(:,end),b)).',B, 'uni', 0));
A(all(ind),1)

Namely, ~ind tells you which strings of B are contained in which rows of A. In the example,

>> ~ind
ans =
     0     0     1     0
     0     0     0     1
     0     0     0     1

How it works: for each string of B, strfind searches the last column of every row of A and returns a vector with the positions where it matches. An empty vector therefore means the string is not present in that row. If the vector is empty for all strings of B, that row of A should be selected.
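The explanation above can be unrolled into separate steps. This is only a sketch using the data from the question; the names pos, absent, keep and result are introduced here for illustration and do not appear in the one-liner.

```matlab
% Data from the question
A = {'A1', 5, 'P01,P02,P03,P04';
     'A2', 7, 'P08,P09';
     'A3', 8, 'P07';
     'A4', 8, 'P10,P11'};
B = {'P07'; 'P10'; 'P11'};

% Step 1: positions of ONE string of B in the last column of every row of A.
% strfind accepts a cell array of strings and returns a cell of position vectors.
pos = strfind(A(:,end), B{1});

% Step 2: true where that string is absent (empty position vector)
absent = cellfun(@isempty, pos).';

% Step 3: repeat for every string of B, one row of ind per string
ind = false(numel(B), size(A,1));
for k = 1:numel(B)
    ind(k,:) = cellfun(@isempty, strfind(A(:,end), B{k})).';
end

% Step 4: keep the rows of A in which every string of B is absent
keep = all(ind, 1);
result = A(keep, 1)
```

With the example data, result should contain 'A1' and 'A2'. The explicit loop computes the same ind matrix as the nested cellfun version; it is just easier to inspect step by step.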

Luis Mendo

Variations on Luis' theme:

% Rows of A whose last column contains none of the strings in B
result = A( all(cellfun('isempty', ...
    cellfun(@strfind, ...
        repmat(A(:,end), 1,size(B,1)), ...
        repmat(B', size(A,1),1), 'UniformOutput', false)), 2), 1)

Somewhat against my own expectations, this is a LOT faster than Luis' solution. I think it is primarily due to calling cellfun with a function name given as a string ('isempty') rather than with an anonymous function; cellfun is a lot faster with such string names than with anonymous functions. cell2mat not being built-in is also a factor.
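To check the speed claim on your own machine, something like the following could be used. It is a rough sketch, assuming timeit is available (it was added in R2013b); the handle names f_anon and f_str are made up here, and the actual timings will depend on the MATLAB version and on the size of A and B.

```matlab
% Data from the question
A = {'A1', 5, 'P01,P02,P03,P04';
     'A2', 7, 'P08,P09';
     'A3', 8, 'P07';
     'A4', 8, 'P10,P11'};
B = {'P07'; 'P10'; 'P11'};

% Nested cellfun with anonymous functions and cell2mat
f_anon = @() A(all(cell2mat(cellfun(@(b) ...
    cellfun(@isempty, strfind(A(:,end), b)).', B, 'uni', 0))), 1);

% repmat variant with the string form cellfun('isempty', ...)
f_str = @() A(all(cellfun('isempty', ...
    cellfun(@strfind, ...
        repmat(A(:,end), 1, size(B,1)), ...
        repmat(B', size(A,1), 1), 'UniformOutput', false)), 2), 1);

% Both variants should select the same rows
assert(isequal(f_anon(), f_str()));

% timeit runs each handle several times and returns a typical time in seconds
t_anon = timeit(f_anon);
t_str  = timeit(f_str);
fprintf('anonymous-function version: %g s\n', t_anon);
fprintf('string-function version:    %g s\n', t_str);
```

On a 4-row example the difference is tiny in absolute terms, so it is worth repeating the comparison with realistically sized data before drawing conclusions.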

Rody Oldenhuis

I suggest you change the way you store the data in A as follows:

A = {'A1',  5,  {'P01','P02','P03','P04'}; 
     'A2',  7,  {'P08','P09'};
     'A3',  8,  {'P07'};
     'A4',  8,  {'P10','P11'}};

B = {'P07'; 'P10'; 'P11'};

Then you can do:

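ind = false(size(A,1),1);   % preallocate as logical so A(ind,1) uses it as a mask
for n = 1:size(A,1)
    ind(n) = ~any(ismember(B,A{n,3}));   % true if no string of B is in row n
end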

A(ind,1)

Or if you prefer a one liner then:

A(cellfun(@(x)(~sum(ismember(B,x))), A(:,3)),1)
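
If the data already exists in the comma-separated form from the question, it does not have to be retyped. The following is a minimal sketch of one way to convert it, assuming strsplit is available (R2013a or later); the names A2, keep and result are introduced here for illustration.

```matlab
% Data in the original comma-separated form
A = {'A1', 5, 'P01,P02,P03,P04';
     'A2', 7, 'P08,P09';
     'A3', 8, 'P07';
     'A4', 8, 'P10,P11'};
B = {'P07'; 'P10'; 'P11'};

% Split each comma-separated string into a cell array of codes
A2 = A;
A2(:,3) = cellfun(@(s) strsplit(s, ','), A(:,3), 'UniformOutput', false);

% Keep the rows whose code list shares no element with B (exact matching)
keep = cellfun(@(x) ~any(ismember(B, x)), A2(:,3));
result = A2(keep, 1)
```

Because ismember compares whole strings, this version would not treat 'P1' as matching 'P10', which a substring search with strfind would.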
Dan
  • Unless you change the way data is stored as you say, `ismember` doesn't work because it only tests for _exact_ matching. That's why I use `strfind` with another `cellfun` – Luis Mendo Jan 09 '14 at 12:27
  • @LuisMendo of course not, that's exactly why I said if you change the data. – Dan Jan 09 '14 at 12:41