3

I have a file full of data in columns

sarah mark john
10    20   5
x     y    z 

I want to sort the data so the columns stay intact but the second row is in increasing order so it looks like this:

john sarah mark
5    10    20 
z    x     y 

I've been looking at the sort command but have only been able to find vertical sorting, not horizontal. I'm happy to use any tool, any help is appreciated. Thank you!

sma
  • 31
  • 1
  • 3

3 Answers3

8

Let's create a function to transpose a file (make rows become columns, and columns become rows):

transpose () {
  awk '{for (i=1; i<=NF; i++) a[i,NR]=$i; max=(max<NF?NF:max)}
        END {for (i=1; i<=max; i++)
              {for (j=1; j<=NR; j++) 
                  printf "%s%s", a[i,j], (j<NR?OFS:ORS)
              }
        }'
}

This just loads all the data into a bidimensional array a[line,column] and then prints it back as a[column,line], so that it transposes the given input. The wrapper transpose () { } is used to store it as a bash function. You just need to copy paste it in your shell (or in ~/.bashrc if you want it to be a permanent function, available any time you open a session).

Then, by using it, we can easily solve the problem by using sort -n -k2: sort numerically based on column 2. Then, transpose back.

$ cat a | transpose | sort -n -k2 | transpose
john sarah mark
5 10 20
z x y

In case you want to have a nice format as final output, just pipe to column like this:

$ cat a | transpose | sort -n -k2 | transpose | column -t
john  sarah  mark
5     10     20
z     x      y

Step by step:

$ cat a | transpose 
sarah 10 x
mark 20 y
john 5 z
$ cat a | transpose | sort -n -k2
john 5 z
sarah 10 x
mark 20 y
$ cat a | transpose | sort -n -k2 | transpose 
john sarah mark
5 10 20
z x y
fedorqui
  • 275,237
  • 103
  • 548
  • 598
  • This works really well (thank you) but it seems computationally very expensive on a large file. If I want to do it more quickly am I stuck with reading it into my own data structure and sorting on it there? – sma Aug 08 '14 at 13:23
1

Coming from a duplicate question, this would sort the columns by the first row:

#!/bin/bash

input="$1"

order=$((for i in $(head -1 $input); do echo $i; done) | nl | sort -k2 | cut -f1)

grep ^ $input | (while read line
  do
    read -a columns <<< "${line%"${line##*[![:space:]]}"}"

    orderedline=()
    for i in ${order[@]}
    do
      orderedline+=("${columns[$i - 1]}")
    done
    line=$(printf "\t%s" "${orderedline[@]}")
    echo ${line:1}
  done)

To sort by second row, replace head -1 $input with head -2 $input | tail -1. If the sort should be numeric, put in sort -n -k2 instead of sort -k2.

Community
  • 1
  • 1
Amadan
  • 191,408
  • 23
  • 240
  • 301
1

Good one-liner gets the job done:

perl -ane '$,=" "; print sort @F; print "\n";' file

I found it here: http://www.unix.com/unix-for-advanced-and-expert-users/36039-horizontal-sorting-lines-file-sed-implementation.html

  • This sorts each line individually, rather than moving the data of the columns so that the first line is sorted. – tripleee Jul 01 '15 at 06:53