
I have a dataframe like so:

> df1
  a b c
1 0.5 0.3 0
2 0.2 0 0
3 0 0.6 0
4 0 0 0.4

I would like to permute the rows within each column with replacement 1000 times, but I want to do this independently for each column (like the reels of a slot machine in Las Vegas).

I noticed that the sample function in R doesn't really allow this. For example, sampling row-wise gives:

> df2 <- df1[sample(nrow(df1)),]
> df2
  a b c
3 0 0.6 0
4 0 0 0.4
2 0.2 0 0
1 0.5 0.3 0

But notice how the whole row is taken as a chunk (i.e. the values in a row stay beside each other across columns, e.g. 0.5 is always beside 0.3).

I don't think doing this both column-wise and row-wise is the correct answer, because then it is permuting horizontally and vertically (i.e. not like a slot machine in Vegas).

user3816990

2 Answers


Here's one way:

df2 <- df1
n   <- nrow(df1)

set.seed(1)
df2[] <- lapply(df1,function(x) x[sample.int(n)] )
#     a   b   c
# 1 0.2 0.3 0.0
# 2 0.0 0.6 0.0
# 3 0.0 0.0 0.4
# 4 0.5 0.0 0.0

Or just use lapply(df1, sample) on the right-hand side, as @akrun said.
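The question also asks for 1000 repetitions, which the snippet above does not cover. A minimal sketch of one way to do that, assuming the results should be kept as a list of data frames; `permute_columns` and `perms` are hypothetical names introduced here, not part of the original answer:

```r
# Example data from the question
df1 <- data.frame(a = c(0.5, 0.2, 0, 0),
                  b = c(0.3, 0, 0.6, 0),
                  c = c(0, 0, 0, 0.4))

# Hypothetical helper: shuffle each column independently,
# keeping the data frame structure via the df[] <- list idiom
permute_columns <- function(df) {
  out <- df
  out[] <- lapply(df, function(x) x[sample.int(length(x))])
  out
}

set.seed(1)
# A list of 1000 independently shuffled copies of df1
perms <- replicate(1000, permute_columns(df1), simplify = FALSE)

perms[[1]]  # inspect the first shuffled copy
```

Indexing with `sample.int(length(x))` avoids the special case where `sample(x)` treats a single number as `1:x`. Note that this shuffles without replacement; passing `replace = TRUE` to the sampler would give a bootstrap resample of each column rather than a permutation.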

Frank

The answer options above return a list, which may be fine for your purposes. Here's another option:

set.seed(1)
matrix(sample(unlist(df1)), ncol = 3, dimnames = list(NULL, letters[1:3]))

       a   b   c
[1,] 0.0 0.2 0.0
[2,] 0.3 0.6 0.5
[3,] 0.0 0.0 0.0
[4,] 0.0 0.4 0.0
Chase
  • This does not permute vectors: a=(0,.3,0,0) is not a permutation of a=(.5,.2,0,0) – Frank May 09 '15 at 17:52
  • Also, while `lapply` returns a list, if you assign it to a `data.frame`, like `df[] <- some_list`, it's all good. – Frank May 09 '15 at 17:54
  • @frank - I guess I misread the requirement of needing to preserve the column-ness of the permutations. In that case, your solution is most appropriate. My solution essentially samples all values and formats back into the original dimensions. – Chase May 10 '15 at 23:07
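
The two points raised in the comments can be checked directly. The following is a rough sketch, not taken from either answer: `is_columnwise_permutation` is a hypothetical helper that tests whether each column of a result holds the same values as the matching original column.

```r
# Example data from the question
df1 <- data.frame(a = c(0.5, 0.2, 0, 0),
                  b = c(0.3, 0, 0.6, 0),
                  c = c(0, 0, 0, 0.4))

# Hypothetical helper: TRUE if every column of `new` contains exactly
# the same values as the matching column of `old`, in any order
is_columnwise_permutation <- function(new, old) {
  all(mapply(function(x, y) identical(sort(x), sort(y)),
             as.data.frame(new), old))
}

set.seed(1)
as_list <- lapply(df1, sample)  # a plain list, not a data frame
df2 <- df1
df2[] <- as_list                # assigning into df2[] keeps the data frame

# Pooling all values and reshaping, as in the second answer
pooled <- matrix(sample(unlist(df1)), ncol = 3,
                 dimnames = list(NULL, names(df1)))

class(as_list)
class(df2)
is_columnwise_permutation(df2, df1)     # per-column shuffle
is_columnwise_permutation(pooled, df1)  # pooled shuffle, usually FALSE
```

The pooled shuffle moves values between columns, so it only preserves the overall set of values, not the contents of each column.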