1

How can I construct a scrambled matrix with 128 rows and 32 columns in vb.net or Matlab?

Entries of the matrix are numbers between 1 and 32 with the condition that each row mustn't contain duplicate elements and rows mustn't be duplicates.

Robert Seifert
  • 25,078
  • 11
  • 68
  • 113

3 Answers3

4

This is similar to @thewaywewalk's answer, but makes sure that the matrix has no repeated rows by testing if it does and in that case generating a new matrix:

done = 0;
while ~done
    [~, matrix] = sort(rand(128,32),2);
    %// generate each row as a random permutation, independently of other rows.
    %// This line was inspired by randperm code
    done = size(unique(matrix,'rows'),1) == 128;
    %// in the event that there are repeated rows: generate matrix again 
end

If my computations are correct, the probability that the matrix has repteated rows (and thus has to be generated again) is less than

>> 128*127/factorial(32)
ans =
  6.1779e-032

Hey, it's more likely that a cosmic ray will spoil a given run of the program! So I guess you can safely remove the while loop :-)

Community
  • 1
  • 1
Luis Mendo
  • 110,752
  • 13
  • 76
  • 147
  • 1
    Surely for this particular case a for loop with exactly 128 iterations and verification test afterwards is better. But isn't your solution very inefficient, in case the probability for repeated rows is higher? Because you're generating a completely new matrix everytime when just one row is repeated. For my opinion it doesn't hurt to generate some more rows as buffer, but it does hurt to generate a complete new matrix multiple times. – Robert Seifert Jan 08 '14 at 15:09
  • @thewaywewalk Completely agree. I have focused on the particular case represented by the numbers in the OP. It would be better to only regenerate the rows detected as duplicate – Luis Mendo Jan 08 '14 at 15:10
  • 1
    have a look at my last edit, I made some interesting test runs. – Robert Seifert Jan 08 '14 at 15:26
  • 1
    I suppose that in theory one could improve slightly by only not-adding the row it it is member of the existing set, rather than starting over. But for practical purposes I think this is definitely a good solution. – Dennis Jaheruddin Jan 08 '14 at 15:40
  • @DennisJaheruddin Agree. Note, however, that I've replaced the `for` loop that used to generate each row by a vectorized solution, inspired by how `randperm` works. – Luis Mendo Jan 08 '14 at 15:43
  • That's a nice approach, very creative! I run your solution a million times, but with `[~, matrix] = sort(rand(24,4),2)`, so the probability for duplicate rows would be almost 100% for my solution, but for yours it never happened that the matrix was incorrect - do you think that would be still possible? I'd assume this is the perfect approach. You can leave out the testing. – Robert Seifert Jan 08 '14 at 17:02
  • @thewaywewalk With `[~, matrix] = sort(rand(24,4),2)` the probability of duplicate rows is very high. I have generated 1e5 matrices and all of them were incorrect (had duplicate rows). – Luis Mendo Jan 08 '14 at 17:45
  • @thewaywewalk In fact, for the (24,4) case I think the probability of correct matrix is `prod((23:-1:1)/24)` (right?), that is 4.65e-10 – Luis Mendo Jan 08 '14 at 17:51
  • I'm not that into statistics ;) but If you say so. I would test it with 1e+12 runs then ;) – Robert Seifert Jan 08 '14 at 18:00
  • 1
    @thewaywewalk 23/24 is the probability that second row is different than first. 22/24 is the probability that third row is different than first and second, etc. And you multiply those probabilities as they are independent events. I've tested with 100000 matrices in the (6,3) case and I get 0.0159, which is very close to `prod((5:-1:1)/6)` = 0.0154. So I think it's correct – Luis Mendo Jan 08 '14 at 18:03
2

With randperm you can generate one row:

row = randperm(32)

if this vector wouldn't be that long you could just use perms to find all permutations:

B = perms(randperm(32))

but it's memory-wise too much! ( 32! = 2.6313e+35 rows )


so you can use a little loop:

N = 200;    

A = zeros(N,32);
for ii = 1:N
    A(ii,:) = randperm(32);
end

B = unique(A, 'rows');
B = B(1:128,:);

For my tests it was sufficient to use N = 128 directly and skip the last two lines, because with 2.6313e+35 possibly permutations the probability that you get a correct matrix with the first try is very high. But to be sure that there are no row-duplicates choose a higher number and select the first 128 rows finally. In case the input vector is relatively short and the number of desired rows close to the total number of possible permutations use the proposed perms(randperm( n )).


small example for intergers from 1 to 4 and a selection of 10 out of 24 possible permutations:

N = 20;    
A = zeros(N,4);
for ii = 1:N
    A(ii,:) = randperm(4);
end
B = unique(A, 'rows');
B = B(1:10,:);

returns:

B =

     1     2     3     4
     1     2     4     3
     1     3     4     2
     2     3     1     4
     2     3     4     1
     2     4     1     3
     2     4     3     1
     3     1     2     4
     3     1     4     2
     3     2     1     4

some additional remarks for the choice of N:

I made some test runs, where I used the loop above to find all permutations like perms does. For vector lengths of n=4 to n=7 and in each case N = factorial(n): 60-80% of the rows are unique. So for small n I would recommend to choose N as follows to be absolutely on the safe side:

N = min( [Q factorial(n)] )*2;

where Q is the number of permutations you want. For bigger n you either run out of memory while searching for all permutations, or the desired subset is so small compared to the number of all possible permutations that repetition is very unlikely! (Cosmic Ray theory linked by Luis Mendo)

Robert Seifert
  • 25,078
  • 11
  • 68
  • 113
0

Your requirements are very loose and allow many different possibilities. The most efficient solution I can think off that meets these requirements is as follows:

p = perms(1:6);
[p(1:128,:) repmat(7:32,128,1)]
Dennis Jaheruddin
  • 21,208
  • 8
  • 66
  • 122
  • `perms(6)` does not really make sense, probably you mean `perms(1:6)` - and also your generated matrix is everything but random, from all 2.6313e+35 possible permutations your subset is the worst distributed possible. If the distribution does not matter, its efficient, that's right. – Robert Seifert Jan 08 '14 at 15:33
  • @thewaywewalk I corrected the call to perms. But indeed for a uniformlly random distributed solution the answer by Luis seems to be the best bet. – Dennis Jaheruddin Jan 08 '14 at 15:35