I want to create a matrix in R from a fixed set of values (e.g. 1 to 10). The values should be randomly assigned across rows and columns BUT must not be repeated within any row or column (so the number 1 should appear exactly once in row 1 and exactly once in column 1, and likewise for every other row and column)!

So for example:

1,2,3,4,5,6,7,8,9,10

2,3,4,5,6,7,8,9,10,1

3,4,5,6,7,8,9,10,1,2

4,5,6,7,8,9,10,1,2,3

5,6,7,8,9,10,1,2,3,4

6,7,8,9,10,1,2,3,4,5

7,8,9,10,1,2,3,4,5,6

8,9,10,1,2,3,4,5,6,7

9,10,1,2,3,4,5,6,7,8

10,1,2,3,4,5,6,7,8,9

But of course in that example the numbers are ascending, and I want them randomized. I tried simple matrix commands but I cannot figure out how to do this. Can anyone help? Thanks in advance!

shampoo

2 Answers

Unless I'm misunderstanding the problem, there's a much simpler way to create this shuffled matrix, without any loops or complicated conditional statements.

# number of rows and columns
n <- 10

# create ordered rows and columns
ordered.by.row <- matrix(1:n, n, n)
ordered.by.col <- matrix(1:n, n, n, byrow = TRUE)

# offset the rows and columns relative to each other.
# no row or column has a repeated value, but the values are still ordered
offset <- (ordered.by.row + ordered.by.col) %% n + 1

# shuffle the columns, then shuffle the rows, this produces a randomized matrix
# 'shuffle.row' is the final, randomized matrix
set.seed(1222) # change this to change randomization
shuffle.col <- offset[, sample(1:n, n, replace = FALSE)]
shuffle.row <- shuffle.col[sample(1:n, n, replace = FALSE), ]

# verify solution
any(apply(shuffle.row, 1, function(r)any(duplicated(r)))) # FALSE
any(apply(shuffle.row, 2, function(r)any(duplicated(r)))) # FALSE

      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
 [1,]    1   10    6    9    2    8    3    5    7     4
 [2,]    3    2    8    1    4   10    5    7    9     6
 [3,]    7    6    2    5    8    4    9    1    3    10
 [4,]    9    8    4    7   10    6    1    3    5     2
 [5,]   10    9    5    8    1    7    2    4    6     3
 [6,]    2    1    7   10    3    9    4    6    8     5
 [7,]    8    7    3    6    9    5   10    2    4     1
 [8,]    6    5    1    4    7    3    8   10    2     9
 [9,]    5    4   10    3    6    2    7    9    1     8
[10,]    4    3    9    2    5    1    6    8   10     7
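If you need this more than once, the steps above can be wrapped in a small function. This is only a sketch: the name `shuffled_latin_square` and the `relabel` argument are my own, and the relabelling step (randomly renaming the values) is an optional extra that is not part of the code above.

```r
# Hypothetical helper (name and 'relabel' argument are illustrative only).
shuffled_latin_square <- function(n, relabel = TRUE) {
  # cyclic "ordered solution": entry (i, j) is (i + j) %% n + 1
  offset <- outer(seq_len(n), seq_len(n), "+") %% n + 1

  # shuffle rows and columns in one indexing step
  out <- offset[sample.int(n), sample.int(n), drop = FALSE]

  # optional extra: randomly rename the values 1..n
  if (relabel) {
    perm <- sample.int(n)
    out[] <- perm[out]
  }
  out
}

m <- shuffled_latin_square(10)
any(apply(m, 1, function(r) any(duplicated(r))))
any(apply(m, 2, function(r) any(duplicated(r))))
```

One caveat: shuffling rows and columns of a single starting square can reach at most (n!)^2 arrangements, which for n = 10 is far fewer than the number of 10 x 10 squares that satisfy the constraint. So every result is valid, but not every valid square can come out of this.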
jdobres
  • Will shuffling keep the randomization? This is for sure much faster than backtracking. – Fernando Mar 20 '17 at 22:01
  • Yes. Performing the randomization in two separate steps is the key. First we create the "ordered solution", where each digit appears once in each row and column, but all digits are still in order. When we randomize the columns, we still meet the criterion of uniqueness for rows. Then we randomize the rows, and still meet uniqueness for columns. The shuffled row and column indices are randomly chosen, of course, so randomness is preserved. – jdobres Mar 20 '17 at 22:16
  • This is perfect! Thank you very much! – shampoo Mar 21 '17 at 06:41
  • Thank you again for your earlier answer! Now I have a slightly different problem (not sure whether I should open another question...). The difference is that half of my sample is given beforehand and needs to stay as it is (let's say rows 1 to 5). I still need randomization for the other rows (6-10), without any row or column duplicates. To be more precise: I have 5 rows such as in your example above – shampoo Sep 12 '17 at 09:45
  •       [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
     [1,]    1   10    6    9    2    8    3    5    7     4
     [2,]    3    2    8    1    4   10    5    7    9     6
     [3,]    7    6    2    5    8    4    9    1    3    10
     [4,]    9    8    4    7   10    6    1    3    5     2
     [5,]   10    9    5    8    1    7    2    4    6     3
    – shampoo Sep 12 '17 at 09:49
  • but now I need those 5 rows to stay as they are and ADD 5 more rows without duplicates (in any row or column). I have tried to adjust the offset calculation but I cannot figure out how to do this (if it is possible at all?) without either randomizing the fixed 5 rows as well or skipping the duplicate check for the last 5. Can you help?? Thank you so much in advance!! – shampoo Sep 12 '17 at 09:50
  • In other words: I need randomization only for rows 6-10 BUT no duplicates should be created with respect to rows 1-5 or within the columns! – shampoo Sep 12 '17 at 09:56
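For the follow-up in the comments above (some rows fixed, the rest random), the offset trick no longer applies directly. One possible approach, sketched here under assumptions, is to add one row at a time, picking each cell at random from the values still allowed in that column and row, and retrying the row if it gets stuck. The function name `extend_latin_rows` is hypothetical, and the sketch assumes the fixed rows use the values 1..n and already contain no row or column duplicates.

```r
# Hypothetical helper, not from the answers: appends random rows to a
# partially filled square and leaves the existing rows untouched.
# Assumes 'fixed' uses values 1..n and has no row/column duplicates.
extend_latin_rows <- function(fixed, n = ncol(fixed)) {
  x <- fixed
  while (nrow(x) < n) {
    new_row <- integer(n)
    stuck <- FALSE
    for (j in seq_len(n)) {
      # values already used in column j or earlier in this new row
      used <- c(x[, j], new_row[seq_len(j - 1)])
      allowed <- setdiff(seq_len(n), used)
      if (length(allowed) == 0) {
        stuck <- TRUE   # dead end: throw this row away and retry
        break
      }
      new_row[j] <- allowed[sample.int(length(allowed), 1)]
    }
    if (!stuck) x <- rbind(x, new_row, deparse.level = 0)
  }
  x
}

fixed <- rbind(
  c( 1, 10, 6, 9,  2,  8, 3, 5, 7,  4),
  c( 3,  2, 8, 1,  4, 10, 5, 7, 9,  6),
  c( 7,  6, 2, 5,  8,  4, 9, 1, 3, 10),
  c( 9,  8, 4, 7, 10,  6, 1, 3, 5,  2),
  c(10,  9, 5, 8,  1,  7, 2, 4, 6,  3)
)
full <- extend_latin_rows(fixed)
any(apply(full, 1, function(r) any(duplicated(r))))
any(apply(full, 2, function(r) any(duplicated(r))))
```

A valid set of rows can always be extended by another valid row, so the retry loop finishes, but it makes no attempt to pick uniformly among all possible completions.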

This seems almost like generating a Sudoku grid. The code below works pretty fast, but some minor R optimizations could be done:

backtrack = function(n = 10){
  # x is filled cell by cell in row-major order; NA marks an empty cell
  x = matrix(NA, ncol = n, nrow = n)

  # every cell starts with its own shuffled list of candidate values
  cells = list()
  k = 1
  for (i in 1:n){
    for (j in 1:n){
      cells[[k]] = sample(1:n)
      k = k + 1
    }
  }

  i = 0 # zero-based index of the cell currently being filled
  while (i < n*n){
    candidates = cells[[i + 1]]
    # if candidates is empty, this draw is discarded by the check below
    idx = sample(1:length(candidates), 1)
    val = candidates[idx]

    if (length(candidates) == 0){
      # dead end: reset this cell's candidates, step back one cell and clear it
      cells[[i + 1]] = sample(1:n)
      i = i - 1
      x[i %/% n + 1, i %% n + 1] = NA
    }
    else {
      rr = i %/% n + 1
      cc = i %% n + 1
      # val is used up for this cell whether or not it fits
      cells[[i + 1]] = candidates[-idx]
      if (!(val %in% x[rr, ]) && !(val %in% x[, cc])){
        # val is new to both its row and its column: place it and move on
        x[rr, cc] = val
        i = i + 1
      }
    }
  }
  x
}

Testing:

set.seed(1) # Please change this
x = backtrack(10)
print(x)

      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
 [1,]    8   10    4    6    9    7    1    2    3     5
 [2,]    5    6    9    8    1   10    4    3    2     7
 [3,]   10    7    1    2    8    9    5    4    6     3
 [4,]    3    9    8   10    6    5    7    1    4     2
 [5,]    9    1    6    4    7    3    2    5   10     8
 [6,]    1    4   10    3    2    6    8    7    5     9
 [7,]    2    8    5    9   10    1    3    6    7     4
 [8,]    6    5    2    7    3    4   10    9    8     1
 [9,]    4    3    7    1    5    2    6    8    9    10
[10,]    7    2    3    5    4    8    9   10    1     6


any(apply(x, 1, function(r)any(duplicated(r)))) # FALSE
any(apply(x, 2, function(r)any(duplicated(r)))) # FALSE
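The two `duplicated` checks above are enough for this output, but they would also pass for a matrix that contains a single `NA` in a row, or values outside 1..n. A slightly stricter check, as a sketch (the name `is_latin_square` is my own), tests that every row and every column is a permutation of 1..n:

```r
# Hypothetical helper: TRUE only if m is square and every row and every
# column contains each of 1..n exactly once.
is_latin_square <- function(m) {
  if (!is.matrix(m) || nrow(m) != ncol(m)) return(FALSE)
  n <- nrow(m)
  is_perm <- function(v) !anyNA(v) && all(sort(v) == seq_len(n))
  all(apply(m, 1, is_perm)) && all(apply(m, 2, is_perm))
}

cyclic <- outer(1:4, 1:4, "+") %% 4 + 1
is_latin_square(cyclic)               # TRUE
is_latin_square(matrix(1:4, 2, 2))    # FALSE
```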
Fernando
  • Wow! Thank you so much, @Fernando!!! This is exactly what I needed (although I'll probably need a while to really understand what you did ;) )!! – shampoo Mar 20 '17 at 21:40
  • You're welcome. It's a backtracking algorithm: https://en.wikipedia.org/wiki/Backtracking – Fernando Mar 20 '17 at 21:44
  • You can accept the answer or wait for more people's input. Normally you should wait a little bit, but in this case I don't see any solution other than backtracking. – Fernando Mar 20 '17 at 21:47
  • Hi Fernando! I have an additional problem related to the question you answered: how can I adjust the backtracking when the first 3 rows are already fixed?? I think this would work with backtracking but I can't manage ... – shampoo Sep 13 '17 at 18:17
  • First rows for example like this: row1 <- c(1,4,7,6,5,3,2,8,9,10) row2 <- c(10,7,3,2,1,4,5,9,8,6) row3 <- c(9,2,4,3,8,7,10,1,6,5) – shampoo Sep 13 '17 at 18:17
  • see https://stackoverflow.com/questions/46186549/constructing-a-randomised-matrix-with-no-duplicates-but-fixed-partial-input – shampoo Sep 13 '17 at 18:23