92

I have some files on Linux (for example, two) and I need to shuffle their lines together into a single file.

For example

$cat file1
line 1
line 2
line 3
line 4
line 5
line 6
line 7
line 8

and

$cat file2
linea one
linea two
linea three
linea four
linea five
linea six
linea seven
linea eight

After shuffling the two files together, I would get something like:

linea eight
line 4
linea five
line 1
linea three
line 8
linea seven
line 5
linea two
linea one
line 2
linea four
line 7
linea six
line 3
line 6
Ondra Žižka
Code Geas Coder
  • 4
    possible duplicate of [How can I shuffle the lines of a text file in Unix command line?](http://stackoverflow.com/questions/2153882/how-can-i-shuffle-the-lines-of-a-text-file-in-unix-command-line) – jfs Oct 27 '14 at 08:04

8 Answers

157

You should use the shuf command =)

cat file1 file2 | shuf

Or with Perl :

cat file1 file2 | perl -MList::Util=shuffle -we 'print shuffle <>;'
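As a rough end-to-end sketch of the shuf approach, the following creates two small sample files, shuffles their lines into one output file, and checks that no line was lost or duplicated. The file names and the temporary directory are placeholders chosen for illustration, and the sketch assumes GNU coreutils `shuf` is on the PATH.

```shell
# Sketch only: the sample file names and temp directory are illustrative assumptions.
workdir=$(mktemp -d)
printf 'line %d\n' 1 2 3 4 5 6 7 8 > "$workdir/file1"
printf 'linea %s\n' one two three four five six seven eight > "$workdir/file2"

# shuf reads a single input, so concatenate first, then shuffle into one file.
cat "$workdir/file1" "$workdir/file2" | shuf > "$workdir/shuffled"

# Sanity check: sorted input and sorted output must be identical.
sort "$workdir/file1" "$workdir/file2" > "$workdir/expected"
sort "$workdir/shuffled" > "$workdir/actual"
if cmp -s "$workdir/expected" "$workdir/actual"; then
    echo "shuffled file contains the same set of lines"
else
    echo "line mismatch" >&2
fi
```

Redirecting to a new file keeps the inputs untouched, which makes it easy to rerun the shuffle if the order is not what you wanted.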
Gilles Quénot
61

Sort (identical lines will be grouped together, so this is not a true shuffle):

cat file1 file2 | sort -R

Shuf:

cat file1 file2 | shuf

Perl:

cat file1 file2 | perl -MList::Util=shuffle -e 'print shuffle<STDIN>'

BASH:

cat file1 file2 | while IFS= read -r line
do
    printf "%06d %s\n" $RANDOM "$line"
done | sort -n | cut -c8-

Awk:

cat file1 file2 | awk 'BEGIN{srand()}{printf "%06d %s\n", rand()*1000000, $0;}' | sort -n | cut -c8-
Ulysse BN
clt60
  • 7
    Please don't use `sort -R` for this task unless you're sure all the lines are distinct (see my comment for Kent's answer). – gniourf_gniourf Jul 10 '13 at 21:38
  • @gniourf_gniourf of course - they're sorted by the hash key - same line = same hash key... – clt60 Jul 10 '13 at 21:44
  • 3
    Hello down voter. Would be nice to know how to improve the answer. – clt60 Mar 06 '15 at 11:52
  • 1
    `shuf` performs best, followed by the `perl` solution; the `awk` solution, while noticeably slower, has the advantage of being POSIX-compliant. `sort -R`, as mentioned, is not a true shuffle, and also quite slow with large input. Using a `bash` loop is slowest by far, and here also hampered by only producing a max. of 32,768 random values. See [here](http://stackoverflow.com/a/30133294/45375) for a more detailed performance comparison. – mklement0 May 12 '15 at 12:28
  • @gniourf_gniourf so `sort -Ru` x) – Fuseteam Oct 27 '20 at 12:28
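To make the point about duplicate lines concrete, here is a small sketch with made-up input. Because `sort -R` orders lines by a hash of their content, identical lines always end up next to each other, so collapsing adjacent duplicates with `uniq` leaves exactly one run per distinct line. It assumes a `sort` that supports `-R` (GNU coreutils does).

```shell
# Made-up input: three copies of "a" interleaved with two copies of "b".
input=$(printf '%s\n' a b a b a)

# Equal lines hash to the same key and stay adjacent after sort -R,
# so uniq -c reports one run per distinct line.
printf '%s\n' "$input" | sort -R | uniq -c

# Count the runs: this equals the number of distinct lines, whatever the order.
runs=$(( $(printf '%s\n' "$input" | sort -R | uniq | wc -l) ))
echo "runs after sort -R: $runs"
```

A real shuffle such as `shuf` will usually interleave the copies instead, which is why `sort -R` is only safe when every line is distinct.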
25

Just a note to OS X users who use MacPorts: the shuf command is part of coreutils and is installed under the name gshuf.

$ sudo port install coreutils
$ gshuf example.txt # or cat example.txt | gshuf
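If a script has to run on both Linux and macOS, one option is to pick whichever shuffle command is available. The sketch below is only an illustration: `portable_shuf` is a hypothetical helper name, and the final fallback is a variant of the awk prefix-sort-cut approach from another answer, not something MacPorts provides.

```shell
# portable_shuf: hypothetical helper that shuffles the lines of stdin.
portable_shuf() {
    if command -v shuf >/dev/null 2>&1; then
        shuf                      # GNU coreutils under its usual name
    elif command -v gshuf >/dev/null 2>&1; then
        gshuf                     # coreutils installed with the g prefix
    else
        # Fallback: prefix each line with a random integer key, sort on it, strip it.
        # srand() seeds from the clock in seconds, so two runs within the
        # same second produce the same order.
        awk 'BEGIN { srand() } { printf "%d\t%s\n", rand() * 1000000000, $0 }' |
            sort -n | cut -f2-
    fi
}

printf '%s\n' one two three four | portable_shuf
```

Checking with `command -v` at call time avoids hard-coding a path, at the cost of a lookup on each call.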
Messa
19

You don't need to use pipes here. Sort alone does this with the file(s) as parameters. I would just do

sort -R file1

or if you have multiple files

sort -R file1 file2
davvs
13

Here's a one-liner that doesn't rely on shuf or sort -R, which I didn't have on my Mac:

while IFS= read -r line; do printf '%s %s\n' "$RANDOM" "$line"; done < my_file | sort -n | cut -d' ' -f2-

This iterates over all the lines in my_file and reprints them in a randomized order.

Tyler
8

I would use shuf too.

Another option: GNU sort has:

   -R, --random-sort
          sort by random hash of keys

You could try:

cat file1 file2 | sort -R
Kent
1

This worked for me. It employs the Fisher-Yates shuffle.

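# Print the arguments in random order using a Fisher-Yates shuffle.
randomize()
{
    local arguments=("$@")
    local -a out
    local i=$#      # number of elements not yet picked
    local j=0 which

    while [[ $i -gt 0 ]] ; do
        which=$(random_range 0 "$i")          # index in [0, i)
        out[j]=${arguments[which]}
        arguments[which]=${arguments[i-1]}    # move the last unpicked element into the gap
        (( i-- ))
        (( j++ ))
    done
    echo "${out[*]}"
}


# Print a random integer in [low, high), or low if the range is empty.
# $RANDOM tops out at 32767, so this only suits short lists.
random_range()
{
    local low=$1
    local range=$(($2 - $1))
    if [[ $range -gt 0 ]]; then
        echo $((low + RANDOM % range))
    else
        echo "$low"
    fi
}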
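For shuffling the lines of files rather than a list of arguments, a line-oriented variant of the same idea might look like the sketch below. It is not the code from this answer: `shuffle_lines` is a hypothetical name, it assumes bash 4 or later for `mapfile`, and it combines two `$RANDOM` draws to get a 30-bit value, which still leaves a slight modulo bias.

```shell
#!/usr/bin/env bash
# shuffle_lines: hypothetical helper; reads lines from stdin and prints
# them in random order using an in-place Fisher-Yates shuffle.
shuffle_lines() {
    local -a lines
    local i j tmp
    mapfile -t lines                      # one array element per input line
    for (( i = ${#lines[@]} - 1; i > 0; i-- )); do
        # Pick j in [0, i]; two $RANDOM draws give 30 random bits.
        j=$(( ((RANDOM << 15) | RANDOM) % (i + 1) ))
        tmp=${lines[i]}
        lines[i]=${lines[j]}
        lines[j]=$tmp
    done
    # Avoid printing a stray blank line when the input was empty.
    if (( ${#lines[@]} > 0 )); then
        printf '%s\n' "${lines[@]}"
    fi
}

printf '%s\n' 'line 1' 'line 2' 'linea one' 'linea two' | shuffle_lines
```

With real files the call would be `cat file1 file2 | shuffle_lines`. Swapping inside one array keeps every line exactly once, which is the property the Fisher-Yates shuffle relies on.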
mmore500
0

This is clearly a biased shuffle (about half the time the output will start with the first line), but for some basic randomization with just bash builtins I guess it is fine? Flip a coin for each line: either print it right away or hold it back, then print the held-back lines at the end.

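shuffle() {
    local line tail=
    while IFS= read -r line; do
        if (( RANDOM % 2 )); then
            # Coin flip came up 1: print this line now.
            printf '%s\n' "$line"
        else
            # Otherwise hold it back until the end.
            tail+=$line$'\n'
        fi
    done < "$1"
    # Print the held-back lines; '%s' keeps any % or \ in the data literal.
    printf '%s' "$tail"
}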
untore