
When I run a simple for loop to compute X number of permutations of a vector, the sample() function returns the same permutation for each iteration.

Below is my code:

options <- commandArgs(trailingOnly=T)
labels <- read.table(options[2], header=F)
holder <- c()

for (i in 1:options[1]){

    perm <- sample(labels[,2:ncol(labels)], replace=F)
    perm <- cbind(as.character(labels[1]), perm)
    holder <- rbind(holder, perm)

}

write.table(holder, file=options[3], row.names=F, col.names=F, quote=F, sep='\t')

Is there a reason for this? Is there another simple way to generate, say, 1000 permutations of a vector?

*Added after comment - a replicable example*

vec <- 1:10
holder <-c()
for (i in 1:5){
    perm <- sample(vec, replace=F)
    holder <- rbind(holder, perm)
}

> holder
     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
perm    3    2    1   10    9    6    7    4    5     8
perm    5    8    2    3    4   10    9    1    6     7
perm   10    7    3    1    4    2    5    8    9     6
perm    9    5    2    8    3    1    6   10    7     4
perm    3    7    5    6    8    2    1    9   10     4

And this works fine! I guess I have a bug somewhere! My input is perhaps in a mess.

Thanks, D.


Darren J. Fitzpatrick
  • I don't know how to replicate your results. Care to give me a hand? http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example – Roman Luštrik Oct 26 '11 at 17:39
  • In your reproducible example, you are sampling a vector (`vec`); in your original, you are sampling a data frame (result of `read.table`). Two very different things. See my answer (written before I saw your update) for more details. – Brian Diggs Oct 26 '11 at 18:01

1 Answer


For a reproducible example, just replace `options[1]` with a constant and set `labels` to a built-in or self-specified data frame. (By the way, neither is a great variable name, since both are base functions.)

Looking at just the inner part of your `for` loop: you shuffle all but the first column of a data frame, and this works as you expect. Put `print(names(perm))` right after creating `perm` and you will see. You then `rbind` this data frame to the previous results. `rbind`, recognizing that it is working with data frames, helpfully reorders the columns of each data frame so that the column names line up. (Generally, that is what you would want: the name of a column defines which one it is, so each column should be extended with the values that belong to it.) That realignment undoes your shuffle, so every row ends up in the column order of the first one.
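To see the realignment in isolation, here is a minimal sketch. The data frame `labels_df` and its values are made up for illustration (they stand in for whatever `read.table` returned), and `set.seed` is there only to make the shuffle repeatable.

```r
# Hypothetical stand-in for the data read by read.table(); values are invented.
labels_df <- data.frame(V1 = "a", V2 = 1, V3 = 2, V4 = 3)
numeric_part <- labels_df[, 2:ncol(labels_df)]

set.seed(1)                      # only so the shuffle is repeatable
shuffled <- sample(numeric_part) # sample() on a data frame permutes its columns
print(names(shuffled))           # the column order here is shuffled

# rbind() on data frames matches columns by name, not by position,
# so the second row is put back into the column order of the first.
stacked <- rbind(numeric_part, shuffled)
print(stacked)
```

Whatever order `names(shuffled)` prints, both rows of `stacked` come out in the `V2`, `V3`, `V4` order of the first data frame.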

The problem is that you are doing permutations on columns of a data frame, not "of a vector" as you seem to think.
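One possible way around this, sketched under the assumption that the input is a single row whose first column is a label, is to pull the values out as a plain vector before sampling. The names `labels_df`, `values`, and `n_perm` are hypothetical, and the data is invented.

```r
# Hypothetical one-row data frame standing in for the question's input.
labels_df <- data.frame(V1 = "a", V2 = 1, V3 = 2, V4 = 3)
n_perm <- 5  # stands in for as.integer(options[1])

# Extract the values as a plain vector so that sample() permutes elements
# rather than data frame columns.
values <- unlist(labels_df[1, -1], use.names = FALSE)

# replicate() returns one permutation per column; t() puts them in rows.
perms <- t(replicate(n_perm, sample(values)))

# Prepend the label. Binding a character value to a numeric matrix
# produces a character matrix, so no name matching takes place.
holder <- cbind(as.character(labels_df[1, 1]), perms)
print(holder)
```

Because `holder` is a matrix here rather than a data frame, nothing is realigned by column name, and each row keeps its own permutation.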

Brian Diggs
  • I didn't know `rbind` was so 'kind'. That may help explain why it's so painstakingly slow on `data.frame`s. Have a +1 for that! – Nick Sabbe Oct 27 '11 at 07:11