
This answer: https://stackoverflow.com/a/11503439/651779 shows how to shuffle a dataframe row-wise and column-wise. I am interested in shuffling column-wise. Starting from the original dataframe

> df1
  a b c
1 1 1 0
2 1 0 0
3 0 1 0
4 0 0 0

shuffling column-wise gives

> df3=df1[,sample(ncol(df1))]
> df3
  c a b
1 0 1 1
2 0 1 0
3 0 0 1
4 0 0 0

However, instead of shuffling complete columns, I would like to shuffle the values within each row independently of the other rows, so that you could get something like

> df4
  c a b
1 0 1 1
2 1 0 0
3 1 0 0
4 0 0 0

Currently I do this by looping over each row, shuffling the row, and putting it into a dataframe. Is there an easier way to do this?

Niek de Klein

1 Answer


Something like:

t(apply(df1, 1, function(x) { sample(x, length(x)) } ))

This will give you the result in matrix form. If you have factors, or a mix of numeric and character columns, be aware that this will coerce everything to character.
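A minimal sketch of what that looks like on the question's `df1`. The second data frame, `df_mixed`, is made up purely to illustrate the coercion and is not from the question:

```r
# The data frame from the question
df1 <- data.frame(a = c(1, 1, 0, 0),
                  b = c(1, 0, 1, 0),
                  c = c(0, 0, 0, 0))

set.seed(42)  # only so the shuffle is reproducible

# apply() passes each row to the function as a vector; the shuffled rows
# come back as columns, so t() turns them into rows again
shuffled <- t(apply(df1, 1, function(x) sample(x, length(x))))

is.matrix(shuffled)  # TRUE: the result is a matrix, not a data frame
dim(shuffled)        # 4 3

# Hypothetical mixed-type data frame, for illustration only
df_mixed <- data.frame(n = c(1, 2), s = c("x", "y"))
shuffled_mixed <- t(apply(df_mixed, 1, function(x) sample(x, length(x))))
typeof(shuffled_mixed)  # "character": the numeric column was coerced
```

Each row keeps its own values and only their order changes, so row sums of `shuffled` match those of `df1`.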

user1317221_G
  • I think `t(apply(df1, 1, sample))` should do it. – A5C1D2H2I1M1N2O1R2T1 Apr 15 '14 at 12:54
  • Regarding your edit, even if they have a mixture of column types to begin with, this would be the only possible way unless they want to shuffle *within* the column types, wouldn't it? – A5C1D2H2I1M1N2O1R2T1 Apr 15 '14 at 12:57
  • @AnandaMahto great improvement, thanks! You are totally correct: if he is shuffling between columns then he will mix the classes anyway. But I suppose that if the OP *just* had factors, for example, I wanted to let them know about the coercion to a new class. – user1317221_G Apr 15 '14 at 13:00
  • That works, thanks! However, it does remove the column names. Is there a way to keep them, or should I re-add them after sampling? – Niek de Klein Apr 15 '14 at 13:19
  • Great, glad it helped. I think it is best just to add them afterwards. – user1317221_G Apr 15 '14 at 13:36
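
Re-adding the names afterwards, as suggested in the last comment, could look like the sketch below. It assumes the question's `df1` and uses the shorter `sample` form from the comments. Keep in mind that after a row-wise shuffle the names no longer say which column a value came from, so they are only positional labels.

```r
# The data frame from the question
df1 <- data.frame(a = c(1, 1, 0, 0),
                  b = c(1, 0, 1, 0),
                  c = c(0, 0, 0, 0))

# Shuffle each row independently; the result is a matrix
shuffled <- t(apply(df1, 1, sample))

# Put the original column names back, then convert to a data frame
colnames(shuffled) <- colnames(df1)
df4 <- as.data.frame(shuffled)

names(df4)  # "a" "b" "c"
```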