4

Let's say we have two matrices

A = [1,2,3;
     2,4,5;
     8,3,5]
B=  [2,3;
     4,5;
     8,5]

How do I perform sediff for each row in A and B respectively without using loops or cellfun, in other words performing setdiff(A(i,:),B(i,:)) for all i. For this example I want to get

[1;
 2;
 3]

I am trying to do this for two very big matrices for my fluid simulator, thus I can't compromise on performance.

UPDATE:

you can assume that the second dimension (number of columns) of the answer will be fixed e.g. the answer will always be some n by m matrix and not some ragged array of different column sizes.

Another Example:

In my case A and B are m by 3 and m by 2 respectively and the answer should be m by 1. A solution for this case will suffice, but a general solution for matrices of size m by n1, m by n2 with answer of m by n3 will be very interesting. another example is

A = [1,2,3,4,5;
     8,4,7,9,6]
B = [2,3;
     4,9]

And the answer is

C = [1,4,5;
     8,7,6]
Divakar
  • 218,885
  • 19
  • 262
  • 358
DontCareBear
  • 825
  • 2
  • 11
  • 25
  • 1
    Number of elements for each such run in a row might have different number of elements unless you have some constraints that puts the number of such *setdfiffed* elements to be the same for each row run. So, how do you plan to store such ragged arrays? – Divakar Sep 10 '15 at 06:21
  • 1
    I agree with Divakar, cellfun would be the way to go, as your output is presumbly a cell-array. – Robert Seifert Sep 10 '15 at 06:22
  • I will fix my question. in my case the second dimension of the answer is guaranteed to be fixed for all rows. in any other case you are both correct. – DontCareBear Sep 10 '15 at 06:25
  • Will the second dimension always be one? Can you please post another example where it isn't? And does it contain zeros? – Robert Seifert Sep 10 '15 at 06:30
  • the second dimension won't contain zeros because I am working on matrices of indices (starting from 1 to m). – DontCareBear Sep 10 '15 at 06:39

1 Answers1

6

Approach #1 Using bsxfun -

mask = all(bsxfun(@ne,A,permute(B,[1 3 2])),3);
At = A.'; %//'
out = reshape(At(mask.'),[],size(A,1)).'

Sample run -

>> A
A =
     1     2     3     4     5
     8     4     7     9     6
>> B
B =
     2     3
     4     9
>> mask = all(bsxfun(@ne,A,permute(B,[1 3 2])),3);
>> At = A.'; %//'
>> out = reshape(At(mask.'),[],size(A,1)).'
out =
     1     4     5
     8     7     6

Approach #2 Using diff and sort -

sAB = sort([A B],2)
dsAB = diff(sAB,[],2)~=0

mask1 = [true(size(A,1),1) dsAB]
mask2 = [dsAB true(size(A,1),1)]

mask = mask1 & mask2
sABt = sAB.'

out = reshape(sABt(mask.'),[],size(A,1)).'
Divakar
  • 218,885
  • 19
  • 262
  • 358
  • is bsxfun faster then a loop? because cellfun for example is only faster when using predefined functions with it. – DontCareBear Sep 10 '15 at 06:45
  • 2
    @DontCareBear It would depend a lot on the available system RAM and the dimensions of inputs involved. Best way would be to test out. But since we are dealing with relational operators, I would think it would be performance boost for sure, unless your inputs stretch bsxfun to the limits. – Divakar Sep 10 '15 at 06:47
  • is their any gain using A.' instead of A' (except syntactically)? – DontCareBear Sep 10 '15 at 06:52
  • @DontCareBear Nope, it's just to make sure we are transposing and not getting complex conjugates. – Divakar Sep 10 '15 at 06:52
  • 1
    @DontCareBear See [`Comparing BSXFUN and REPMAT`](http://stackoverflow.com/a/29719681/3293881) for some benchmarks on bsxfun relational operators. – Divakar Sep 10 '15 at 06:53
  • Do you think it's possible to do it with only matrix operation? assuming unique elements per row. – DontCareBear Sep 10 '15 at 06:59
  • @DontCareBear Let me ask you - Do they always have unique elements per row? – Divakar Sep 10 '15 at 07:00
  • yes you can definitely assume that. in my code the first matrix is a matrix of face endpoint indices (of a 2d mesh) and the second matrix is edge endpoint indices. – DontCareBear Sep 10 '15 at 07:05
  • @DontCareBear Also, does the order of elements in each row in the expected output matter? – Divakar Sep 10 '15 at 07:17
  • 1
    @DontCareBear Check out the new added approach? – Divakar Sep 10 '15 at 07:23
  • The first one is faster. Nevertheless thank you very much. the two answers helped me a lot. Amazing answers! – DontCareBear Sep 10 '15 at 07:32
  • @DontCareBear never forget *bsxfun is always the fastest!* ;) – Robert Seifert Sep 10 '15 at 07:32
  • @thewaywewalk Not suprised for sure I bet! ;) – Divakar Sep 10 '15 at 07:33
  • @DontCareBear Just curious, about the speedups you are getting with these solutions over your original setdiff based approach? – Divakar Sep 10 '15 at 07:34
  • my first method was just using for loop with setdiff which can't be good :P I run each of the methods with tic toc and that's what I got: Elapsed time is 3.137423 seconds. lousy for loop Elapsed time is 0.002065 seconds. your first method Elapsed time is 0.002508 seconds. your second method – DontCareBear Sep 10 '15 at 07:43
  • 1
    @DontCareBear Hmm, nice!! Thanks for getting those numbers! Well, bsxfun FTW! ;) – Divakar Sep 10 '15 at 07:44