4

I have a vector v of size (m,1) whose elements are integers picked from 1:n. I want to create a matrix M of size (m,n) whose elements M(i,j) are 1 if v(i) = j, and are 0 otherwise. I do not want to use loops, and would like to implement this as a simple vector-matrix manipulation only.

So I thought first, to create a matrix with repeated elements

 M = v * ones(1,n) % this is a (m,n) matrix of repeated v

For example v=[1,1,3,2]' m = 4 and n = 3

M =
     1     1     1
     1     1     1
     3     3     3
     2     2     2

then I need to create a comparison vector c of size (1,n)

c = 1:n
1 2 3

Then I need to perform a series of logical comparisons

M(1,:)==c % this results in [1,0,0]
.
M(4,:)==c % this results in [0,1,0]

However, I thought it should be possible to perform the last steps of going through each single row in compact matrix notation, but I'm stumped and not knowledgeable enough about indexing. The end result should be

M =
     1     0     0
     1     0     0
     0     0     1
     0     1     0
rayryeng
  • 102,964
  • 22
  • 184
  • 193
gciriani
  • 611
  • 2
  • 7
  • 19

3 Answers3

5

A very simple call to bsxfun will do the trick:

>> n = 3;
>> v = [1,1,3,2].';
>> M = bsxfun(@eq, v, 1:n)

M =

     1     0     0
     1     0     0
     0     0     1
     0     1     0

How the code works is actually quite simple. bsxfun is what is known as the Binary Singleton EXpansion function. What this does is that you provide two arrays / matrices of any size, as long as they are broadcastable. This means that they need to be able to expand in size so that both of them equal in size. In this case, v is your vector of interest and is the first parameter - note that it's transposed. The second parameter is a vector from 1 up to n. What will happen now is the column vector v gets replicated / expands for as many values as there are n and the second vector gets replicated for as many rows as there are in v. We then do an eq / equals operator between these two arrays. This expanded matrix in effect has all 1s in the first column, all 2s in the second column, up until n. By doing an eq between these two matrices, you are in effect determining which values in v are equal to the respective column index.


Here is a detailed time test and breakdown of each function. I placed each implementation into a separate function and I also let n=max(v) so that Luis's first code will work. I used timeit to time each function:

function timing_binary

n = 10000;
v = randi(1000,n,1);
m = numel(v);

    function luis_func()
    M1 = full(sparse(1:m,v,1));       
    end

    function luis_func2()
    %m = numel(v);
    %n = 3; %// or compute n automatically as n = max(v);
    M2 = zeros(m, n);
    M2((1:m).' + (v-1)*m) = 1;      
    end

    function ray_func()
    M3 = bsxfun(@eq, v, 1:n);
    end

    function op_func()
    M4= ones(1,m)'*[1:n] == v * ones(1,n);
    end

t1 = timeit(@luis_func);
t2 = timeit(@luis_func2);
t3 = timeit(@ray_func);
t4 = timeit(@op_func);

fprintf('Luis Mendo - Sparse: %f\n', t1);
fprintf('Luis Mendo - Indexing: %f\n', t2);
fprintf('rayryeng - bsxfun: %f\n', t3);
fprintf('OP: %f\n', t4);


end

This test assumes n = 10000 and the vector v is a 10000 x 1 vector of randomly distributed integers from 1 up to 1000. BTW, I had to modify Luis's second function so that the indexing will work as the addition requires vectors of compatible dimensions.

Running this code, we get:

>> timing_binary
Luis Mendo - Sparse: 0.015086
Luis Mendo - Indexing: 0.327993
rayryeng - bsxfun: 0.040672
OP: 0.841827

Luis Mendo's sparse code wins (as I expected), followed by bsxfun, followed by indexing and followed by your proposed approach using matrix operations. The timings are in seconds.

rayryeng
  • 102,964
  • 22
  • 184
  • 193
  • oh shoot I was writing this exact same answer haha I definitely type too slow! – Benoit_11 Sep 09 '15 at 18:04
  • @Benoit_11 - lol sorry :) – rayryeng Sep 09 '15 at 18:10
  • @LuisMendo - Thanks! :D I had to change it slightly because I read the question too fast. – rayryeng Sep 09 '15 at 18:10
  • I don't think you need the dot operator in front of the transpose sign. – gciriani Sep 09 '15 at 19:18
  • 1
    @LuisMendo would disagree with you. The `.'` is to perform a transpose regardless of the array being real or complex. Doing `'` by itself will perform the **complex** transpose. I use the dot because I explicitly want to make this clear. It's more of a stylistic choice though. In your case, you have no imaginary components so you can use either or, but I'm used to using `.'`. Not using it has bitten me in the @$$ a couple of times... and so I'm just used to using it. – rayryeng Sep 09 '15 at 19:19
  • I'll run some timing tests and I'll see which is faster. – rayryeng Sep 09 '15 at 19:20
  • @gciriani - Your method is the slowest. I'll update my post. – rayryeng Sep 09 '15 at 19:29
  • 1
    @gciriani I always advise: use `.'` when you just want to transpose! Here are two some examples of the mistakes caused by using `'` when you mean `.'`: http://stackoverflow.com/questions/22258444/transfer-function-in-time-domain http://stackoverflow.com/questions/23509241/whats-the-difference-between-two-way-to-input-matlab-complex-matrices – Luis Mendo Sep 09 '15 at 20:31
  • Cool! Thanks. While you were writing this I tried to with a stopwatch to run a 1o^6 vector against 10^4 values, and my PC froze: no ctrl-C or anything else could make it stop. I had to physically pull the plug :-D – gciriani Sep 09 '15 at 23:20
3

Assuming n equals max(v), you can use sparse:

v = [1,1,3,2];
M = full(sparse(1:numel(v),v,1));

What sparse does is build a sparse matrix using the first argument as row indices, the second as column indices, and the third as matrix values. This is then converted into a full matrix with full.


Another approach is to define the matrix containing initially zeros and then use linear indexing to fill in the ones:

v = [1,1,3,2];
m = numel(v);
n = 3; %// or compute n automatically as n = max(v);
M = zeros(m, n);
M((1:m) + (v-1)*m) = 1;
Luis Mendo
  • 110,752
  • 13
  • 76
  • 147
1

I think I've also found a way to do it, and it would be nice if somebody could tell me which of the methods shown is faster for very large vectors and matrices. The additional method I thought of is the following

M= ones(1,m)'*[1:n] == v * ones(1,n)
gciriani
  • 611
  • 2
  • 7
  • 19