I have a row of data, say A = [0 1 1 1 0 0].
Matrix B contains many rows. For a dummy example, let's say it's just B = [1 1 1 0 1 0; 1 0 0 1 0 1].
I want to find the number of columns in which A and each row of B differ, and use that vector of difference counts to find which row of B is most similar to A. So for the example above, A differs from B(1,:) in columns 1, 4, 5 = 3 total differences. A differs from B(2,:) in columns 1, 2, 3, 6 = 4 total differences, so I would want to return index 1 to indicate that A is most similar to B(1,:).
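To make the toy example concrete, here is a minimal sketch of the per-row difference count on the data above. The variable names `mismatch`, `diffs`, and `idx` are illustrative only and not part of my actual code.

```matlab
% Toy data from the example above.
A = [0 1 1 1 0 0];
B = [1 1 1 0 1 0; 1 0 0 1 0 1];

% Compare every row of B with A; bsxfun expands A across the rows of B.
mismatch = bsxfun(@ne, B, A);   % logical matrix, true where entries differ

% Count the differing columns per row, then pick the smallest count.
diffs = sum(mismatch, 2);       % diffs is [3; 4] for this data
[~, idx] = min(diffs);          % idx is 1, i.e. B(1,:) is closest to A
```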
In reality B has ~50,000 rows, and A and B both have about 800 columns. My current code to find the most similar row is below:
[~, idx] = min(sum(xor(repmat(A,B_rows,1),B),2));
That works, but it's very slow. Any insights into which function is taking so long and ways to improve it?
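One repmat-free formulation that could be timed against the line above is sketched here, assuming A and B contain only 0/1 values. It relies on the identity xor(a,b) = a + b - 2*a*b for binary values, which turns the count into a single matrix-vector product. The names `Ad`, `Bd`, `diffs`, and `idx` are illustrative, and I have not measured whether this is actually faster on the full-size data.

```matlab
% Toy data; the real A and B are assumed to hold only 0/1 values.
A = [0 1 1 1 0 0];
B = [1 1 1 0 1 0; 1 0 0 1 0 1];

% Convert to double so the matrix product is defined for logical input.
Ad = double(A);
Bd = double(B);

% For 0/1 values, xor(a,b) = a + b - 2*a*b, so summing over columns gives
% diffs = sum(B,2) + sum(A) - 2*B*A'.
diffs = sum(Bd, 2) + sum(Ad) - 2 * (Bd * Ad');

% Row of B with the fewest differing columns.
[~, idx] = min(diffs);
```

Wrapping each variant in tic/toc, or running it under the profiler, would show which call dominates the run time.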