
I am trying to shuffle a pandas dataframe by row instead of column.

I have the following dataframe:

   row1    row2    row3
1    3      1       6
2    5      2       7
3    7      3       8 
4    9      4       9

And would like to shuffle the df to achieve a random permutation such as:

   row1    row2    row3
1    6      3       1
2    3      9       2
3    7      5       8 
4    4      9       7

I tried:

df1 = df.reindex(np.random.permutation(df.index))

However, this only reorders whole rows; it does not shuffle the values within each row.

Joey
    Not sure I understand - if it's per row, shouldn't the elements in every row stay there, but simply shuffled? – Divakar Dec 21 '17 at 11:05

1 Answer


You can use the sample method with axis=1. This shuffles the order of the columns, so every row is rearranged by the same permutation:

df = df.sample(frac=1, axis=1).reset_index(drop=True)

However, your desired dataframe looks completely randomised. You can get closer to that by shuffling the column order and then the row order:

df = df.sample(frac=1, axis=1).sample(frac=1).reset_index(drop=True)
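
As a quick check of what the chained sample calls do, here is a minimal sketch using the dataframe from the question. The variable names (shuffled, sums_before, sums_after) are only illustrative. It suggests that rows and columns change position while each column keeps its own values, so the per-column sums do not change:

```python
import pandas as pd

# Dataframe from the question (the columns are literally named row1..row3)
df = pd.DataFrame(
    {"row1": [3, 5, 7, 9], "row2": [1, 2, 3, 4], "row3": [6, 7, 8, 9]},
    index=[1, 2, 3, 4],
)

# Shuffle the column order, then the row order
shuffled = df.sample(frac=1, axis=1).sample(frac=1).reset_index(drop=True)

# Compare per-column sums after aligning both results by column name
sums_before = df.sum().sort_index()
sums_after = shuffled.sum().sort_index()
print(sums_before.equals(sums_after))  # True
```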

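Edit: to shuffle the values within each row independently, apply a permutation to every row:

import numpy as np
# result_type="broadcast" keeps the original shape, index and columns;
# without it, recent pandas versions return a Series of arrays.
df = df.apply(np.random.permutation, axis=1, result_type="broadcast")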
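
If the goal is the fully randomised frame shown in the question, a NumPy-based sketch might look like the following. It assumes NumPy 1.20 or later (for Generator.permuted); the seed and the names rng, row_shuffled and fully_shuffled are arbitrary choices for illustration, not part of the original answer:

```python
import numpy as np
import pandas as pd

# Dataframe from the question
df = pd.DataFrame(
    {"row1": [3, 5, 7, 9], "row2": [1, 2, 3, 4], "row3": [6, 7, 8, 9]},
    index=[1, 2, 3, 4],
)

rng = np.random.default_rng(0)  # seed used only to make the run repeatable

# Shuffle the values inside each row independently (same idea as the apply call)
row_shuffled = pd.DataFrame(
    rng.permuted(df.to_numpy(), axis=1), index=df.index, columns=df.columns
)

# Shuffle every cell across the whole frame: flatten, permute, reshape
flat = rng.permutation(df.to_numpy().ravel())
fully_shuffled = pd.DataFrame(
    flat.reshape(df.shape), index=df.index, columns=df.columns
)

print(row_shuffled)
print(fully_shuffled)
```

The first variant keeps each row's values in that row, so only the column sums change. The second moves values between rows as well, which is closer to the example output in the question.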
Jan Zeiseweis
  • This only shuffles the values in each column and just moves the values to a different position in the dataframe. I want each row to be shuffled so that the sum of each column is not equal to the original sum. Hope this makes sense? – Joey Dec 21 '17 at 11:40
  • You are right! I edited my answer! – Jan Zeiseweis Dec 21 '17 at 11:54