I want to create a script that randomly shuffles the rows and columns of a large CSV file. For example, for an initial file f.csv:
a, b, c, d
e, f, g, h
i, j, k, l
First, we shuffle the rows to obtain f1.csv:
e, f, g, h
a, b, c, d
i, j, k, l
Then, we shuffle the columns to obtain f2.csv:
g, e, h, f
c, a, d, b
k, i, l, j
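To shuffle the rows, we can use this awk script from here:
awk 'BEGIN { srand() }          # seed the random number generator
{ lines[++d] = $0 }             # store every input line; d counts the lines
END {
    # Draw random indices until every stored line has been printed once.
    while (e < d) {
        RANDOM = int(1 + rand() * d)    # random index in 1..d
        if (RANDOM in lines) {          # skip indices that were already printed
            print lines[RANDOM]
            delete lines[RANDOM]
            ++e                         # e counts the lines printed so far
        }
    }
}' f.csv > f1.csv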
But how can I shuffle the columns?
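One possible direction, given only as a rough sketch: pick a single random column order when the first row is read (here with a Fisher-Yates shuffle) and apply that same order to every row. This assumes a simple CSV in which no field contains a quoted comma and every row has the same number of columns. The sample f1.csv written by `printf`, the ` *, *` field separator, and the variable names `order`, `n`, and `line` are my own illustrative choices.

```shell
# Sample input matching the row-shuffled example above (illustrative only).
printf '%s\n' 'e, f, g, h' 'a, b, c, d' 'i, j, k, l' > f1.csv

# Split on commas with optional surrounding spaces; join output with ", ".
awk -F' *, *' -v OFS=', ' '
BEGIN { srand() }
NR == 1 {
    # Build one random column order with a Fisher-Yates shuffle.
    # The same order is reused for every row.
    n = NF
    for (i = 1; i <= n; i++) order[i] = i
    for (i = n; i > 1; i--) {
        j = int(1 + rand() * i)          # random index in 1..i
        tmp = order[i]; order[i] = order[j]; order[j] = tmp
    }
}
{
    # Emit the fields of the current row in the shuffled order.
    line = $(order[1])
    for (i = 2; i <= n; i++) line = line OFS $(order[i])
    print line
}' f1.csv > f2.csv
```

Because each row is printed as soon as it is read, only the column order is kept in memory, which may matter for a large file. Note that `srand()` without an argument seeds from the current time, so two runs within the same second are likely to produce the same column order.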