Use df.sample() to get samples of data from a DataFrame:
import numpy as np
import pandas as pd
a = np.arange(1, 81).reshape(8, 10)
df = pd.DataFrame(a)
df1 = df.sample(frac=.25)
>>df1
0 1 2 3 4 5 6 7 8 9
5 51 52 53 54 55 56 57 58 59 60
3 31 32 33 34 35 36 37 38 39 40
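If you need a fixed number of rows, or a draw you can repeat, sample also accepts n and random_state. A minimal sketch, assuming the same 8x10 frame as above; the seed value 0 is arbitrary and only there for illustration:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(1, 81).reshape(8, 10))

# Draw exactly 2 rows; random_state makes the draw repeatable.
two_rows = df.sample(n=2, random_state=0)

# Sampling again with the same seed returns the same rows.
again = df.sample(n=2, random_state=0)
print(two_rows.equals(again))  # True
```

Leave random_state out when you want a different sample on every call.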
For a random permutation, use np.random.permutation():
df.iloc[np.random.permutation(len(df))].head(2)
0 1 2 3 4 5 6 7 8 9
6 61 62 63 64 65 66 67 68 69 70
1 11 12 13 14 15 16 17 18 19 20
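A full row shuffle can also be written with sample itself, since frac=1 returns every row once in random order. A short sketch on the same frame (the variable name shuffled is just for illustration):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(1, 81).reshape(8, 10))

# frac=1 returns every row exactly once, in random order.
shuffled = df.sample(frac=1)

# The index labels are a permutation of the original ones.
print(sorted(shuffled.index) == list(df.index))  # True
```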
EDIT: To get 20 elements in a list, use:
import itertools
list(itertools.chain.from_iterable(df.sample(frac=.25).values))
#[71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
frac=.25 means 25% of the rows. Since the frame has 80 elements in 8 rows of 10, 25% is 2 rows, which gives you 20 elements. You can adjust the fraction depending on how many elements you have and how many you want.
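The arithmetic can also be done in code by working out the number of rows from the number of elements you want. This is only a sketch: wanted is a hypothetical target, and it assumes the target is a multiple of the number of columns.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(1, 81).reshape(8, 10))

wanted = 20                      # hypothetical target number of elements
n_rows = wanted // df.shape[1]   # 20 elements / 10 columns = 2 rows

# Sample that many rows and flatten them into a plain list.
elements = df.sample(n=n_rows).to_numpy().ravel().tolist()
print(len(elements))  # 20
```

Passing n=n_rows is equivalent here to frac=n_rows / len(df), which is .25 for this frame.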
EDIT1: Further to the edit in your question: print(df.values) gives you an array:
[[ 1 2 3 4 5 6 7 8 9 10]
[11 12 13 14 15 16 17 18 19 20]
[21 22 23 24 25 26 27 28 29 30]
[31 32 33 34 35 36 37 38 39 40]
[41 42 43 44 45 46 47 48 49 50]
[51 52 53 54 55 56 57 58 59 60]
[61 62 63 64 65 66 67 68 69 70]
[71 72 73 74 75 76 77 78 79 80]]
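You need to shuffle this array using np.random.shuffle. It only shuffles along the first axis, so in this case do it on df.T.values, since you also want to shuffle the columns. Note that this modifies df only if df.T.values is a view of the frame's data, which may not hold in newer pandas versions: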
np.random.shuffle(df.T.values)
Then do a reshape:
df1 = pd.DataFrame(np.reshape(df.values,(4,20)))
>>df1
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19
0 4 3 10 2 8 7 1 5 6 9 14 13 20 12 18 17 11 15 16 19
1 24 23 30 22 28 27 21 25 26 29 34 33 40 32 38 37 31 35 36 39
2 44 43 50 42 48 47 41 45 46 49 54 53 60 52 58 57 51 55 56 59
3 64 63 70 62 68 67 61 65 66 69 74 73 80 72 78 77 71 75 76 79
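The in-place shuffle above depends on df.T.values sharing memory with df, which, as far as I know, pandas does not guarantee. A minimal sketch of a variant that avoids that assumption by indexing a copy with a random column order (col_order, shuffled and all_shuffled are illustrative names):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(1, 81).reshape(8, 10))

# Pick a random column order and apply it to a copy of the data.
col_order = np.random.permutation(df.shape[1])
shuffled = df.to_numpy()[:, col_order]

# 8x10 -> 4x20: each new row holds two consecutive original rows.
df1 = pd.DataFrame(shuffled.reshape(4, 20))

# To shuffle every element independently instead of whole columns:
all_shuffled = pd.DataFrame(
    np.random.permutation(df.to_numpy().ravel()).reshape(4, 20)
)
```

df itself is left untouched, so you can draw a fresh shuffle as often as you like.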