Given an ls -l listing of directories named after software release versions, how can I sort it into human-friendly order? E.g.:

$ ls -loghF
total 209
drwxr-xr-x   2       3 Jun 18 11:33 12.0.40.0/
drwxr-xr-x   2       3 Aug 24 14:45 13.0.11.10/
drwxr-xr-x   2       3 Jul 13 14:12 13.0.11.4/
drwxr-xr-x   2       3 Jul 26 15:30 13.0.11.5/
drwxr-xr-x   2       4 Jul 27 11:33 13.0.11.6/
drwxr-xr-x   2       3 Aug  3 11:41 13.0.11.7/
drwxr-xr-x   2       3 Aug 10 11:53 13.0.11.8/
drwxr-xr-x   2       3 Aug 17 17:00 13.0.11.9/
drwxr-xr-x   2       3 Aug  3 14:37 13.0.17.0/
drwxr-xr-x   2       3 Aug 13 11:50 13.0.18.0/
drwxr-xr-x   2       3 Aug 17 11:21 13.0.19.0/
drwxr-xr-x   2       3 Jul 28 15:00 13.0.9.1/

The desired result is:

$ ls -loghF | sort ...
total 209
drwxr-xr-x   2       3 Jun 18 11:33 12.0.40.0/
drwxr-xr-x   2       3 Jul 28 15:00 13.0.9.1/
drwxr-xr-x   2       3 Jul 13 14:12 13.0.11.4/
drwxr-xr-x   2       3 Jul 26 15:30 13.0.11.5/
drwxr-xr-x   2       4 Jul 27 11:33 13.0.11.6/
drwxr-xr-x   2       3 Aug  3 11:41 13.0.11.7/
drwxr-xr-x   2       3 Aug 10 11:53 13.0.11.8/
drwxr-xr-x   2       3 Aug 17 17:00 13.0.11.9/
drwxr-xr-x   2       3 Aug 24 14:45 13.0.11.10/
drwxr-xr-x   2       3 Aug  3 14:37 13.0.17.0/
drwxr-xr-x   2       3 Aug 13 11:50 13.0.18.0/
drwxr-xr-x   2       3 Aug 17 11:21 13.0.19.0/

The sort must skip past the date portion of each line, then sort numerically (e.g., starting with the 12 or 13), using '.' as the field separator.

I thought of two approaches, but am having difficulty with the sort -k syntax, if it's supported at all:

(1) Skip the first 36 characters, then with '.' as field separator, sort numerically on the next 4 fields.

(2) With field separator as whitespace, skip to the 7th field, then change the field separator to '.' and sort numerically on the next 4 fields.

The alternative is a little Perl script, but can't Unix sort do this "simple" task?

John DB

3 Answers

Here's a command line that uses awk to put the version number at the front of each line, sorts on four numeric keys, then uses cut to strip the temporary key from the front:

$ ls -loghF | awk '{ print $7, $0; }' | sort -n -t. -k1,1 -k2,2 -k3,3 -k4,4 | cut -d' ' -f2-
drwxr-xr-x   2       3 Jun 18 11:33 12.0.40.0/
drwxr-xr-x   2       3 Jul 28 15:00 13.0.9.1/
drwxr-xr-x   2       3 Jul 13 14:12 13.0.11.4/
drwxr-xr-x   2       3 Jul 26 15:30 13.0.11.5/
drwxr-xr-x   2       4 Jul 27 11:33 13.0.11.6/
drwxr-xr-x   2       3 Aug  3 11:41 13.0.11.7/
drwxr-xr-x   2       3 Aug 10 11:53 13.0.11.8/
drwxr-xr-x   2       3 Aug 17 17:00 13.0.11.9/
drwxr-xr-x   2       3 Aug 24 14:45 13.0.11.10/
drwxr-xr-x   2       3 Aug  3 14:37 13.0.17.0/
drwxr-xr-x   2       3 Aug 13 11:50 13.0.18.0/
drwxr-xr-x   2       3 Aug 17 11:21 13.0.19.0/

The sort command there is borrowed from this answer. Another answer suggests sort -V (version sort), which the sort on this machine doesn't have (yours might, though, so it's worth trying). Version sort is likely to be specific to newer GNU coreutils: my Linux box, whose sort comes from GNU coreutils 8.5, has it.

With version sort:

$ ls -loghF | sort -k7,7V
drwxr-xr-x   2       3 Jun 18 11:33 12.0.40.0/
drwxr-xr-x   2       3 Jul 28 15:00 13.0.9.1/
drwxr-xr-x   2       3 Jul 13 14:12 13.0.11.4/
drwxr-xr-x   2       3 Jul 26 15:30 13.0.11.5/
drwxr-xr-x   2       4 Jul 27 11:33 13.0.11.6/
drwxr-xr-x   2       3 Aug  3 11:41 13.0.11.7/
drwxr-xr-x   2       3 Aug 10 11:53 13.0.11.8/
drwxr-xr-x   2       3 Aug 17 17:00 13.0.11.9/
drwxr-xr-x   2       3 Aug 24 14:45 13.0.11.10/
drwxr-xr-x   2       3 Aug  3 14:37 13.0.17.0/
drwxr-xr-x   2       3 Aug 13 11:50 13.0.18.0/
drwxr-xr-x   2       3 Aug 17 11:21 13.0.19.0/
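
Since -V is not available everywhere, a wrapper can probe for it and fall back to the awk pipeline. This is only a minimal sketch: the function name sort_by_version is made up for illustration, and the probe assumes that a sort without -V exits non-zero when given the unknown option.

```shell
# Hypothetical helper: sort `ls -l`-style lines on stdin by the version
# number in whitespace-separated field 7.
# Usage: ls -loghF | sort_by_version
sort_by_version() {
    # Assumption: a sort that lacks -V rejects it with a non-zero exit status.
    if sort -V </dev/null >/dev/null 2>&1; then
        # Version sort on field 7 only.
        sort -k7,7V
    else
        # Fallback: copy field 7 to the front, sort on four dot-separated
        # numeric keys, then strip the temporary key again.
        awk '{ print $7, $0 }' |
            sort -t. -k1,1n -k2,2n -k3,3n -k4,4n |
            cut -d' ' -f2-
    fi
}
```

Either branch should give the same order for four-part numeric versions like the ones in the question.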
nneonneo
  • Solaris `sort` is rather unlikely to have such options as version-number or 'natural' sorting. – Jonathan Leffler Aug 27 '12 at 05:23
  • What do you mean by 'natural' sorting? Note that GNU coreutils can be installed on a lot of machines, so it's not impossible to have `sort -V` on Solaris. – nneonneo Aug 27 '12 at 05:25
  • See [tag:natural-sort] for examples. See also [Coding Horror](http://www.codinghorror.com/blog/2007/12/sorting-for-humans-natural-sort-order.html). Etc. – Jonathan Leffler Aug 27 '12 at 05:27
  • Of course you can add GNU coreutils to the machine; but you'd never put that lot in `/usr/bin` on Solaris, not if you value your sanity. Don't touch `/bin` or `/usr/bin` on any machine unless you're very confident that every script provided with the system will be happy with the changed semantics of the commands. – Jonathan Leffler Aug 27 '12 at 05:30
  • Thanks folks, but if I have to script it, I'll use Perl - no awk'ing. Can sort handle this directly? It seems the key point is, how to use multiple field separators, or combining a field separator with a numeric field counter. – John DB Aug 27 '12 at 05:31
  • FYI: I have GNU sort available, if it helps. `$ /opt/sfw/bin/sort --version` gives `sort (GNU coreutils) 5.97` – John DB Aug 27 '12 at 05:33
  • 5.97 is rather unlikely to have -V, though you can always try. With perl's sort, you can specify an arbitrary comparison function, with which it would be very simple to pull out the version numbers individually and compare them. – nneonneo Aug 27 '12 at 05:38
  • `/opt/sfw/bin/sort: invalid option -- V` But anyway, can sort handle more than one field separator, which would make this a simple task? – John DB Aug 27 '12 at 05:42
  • @JohnBuehrer: AFAIK, you are limited to one field separator per command line with [`sort`](http://pubs.opengroup.org/onlinepubs/9699919799/utilities/sort.html). Think about what it would mean to specify `sort -t . -k 12 -t ' ' -k 3`; it would have to split the line on dots to find the 12th field; then start over and split the line again on spaces to find the 3rd field. You can reliably use the `-k7.1n,7.2` notation to sort on the first (2-digit) and second (1-digit) fields in the example data, but the 3rd or 4th sub-fields are variable width data and fixed offsets don't help enough. – Jonathan Leffler Aug 27 '12 at 05:56
  • Exactly, and that's exactly what I want - sort to figure out where the fields are, based on delimiters which might change along the way. Not rocket science at all, but it's looking like Perl will be needed. For shame, Unix! – John DB Aug 27 '12 at 06:01
  • Well, one of the Unix philosophies is "write simple parts connected by clean interfaces." In this case, sort is deliberately simple, since complicated operations can be done by chaining together multiple commands (as the answer did). – nneonneo Aug 27 '12 at 06:04
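
For what it's worth, approach (1) from the question (skip 36 characters, then split on '.') can be written with the same field.character key notation mentioned in the comments, without a second delimiter. The following is a sketch under one loud assumption: the version number starts at column 37 on every line, which is true of the listing in the question but breaks as soon as any earlier column gets wider. The function name sort_by_offset is hypothetical.

```shell
# Hypothetical helper: sort fixed-width `ls -loghF` lines on stdin.
# Assumption: the version starts at column 37 of every line.
# Usage: ls -loghF | sort_by_offset
sort_by_offset() {
    # With -t. the first field is everything before the first dot, so
    # -k1.37,1n reads the number that starts at character 37 of field 1.
    # The remaining three components are ordinary dot-separated fields.
    sort -t. -k1.37,1n -k2,2n -k3,3n -k4,4n
}
```

The awk-based pipeline is more robust because it finds the version by field rather than by column, so this is mainly useful when no extra process is wanted and the column widths are known to be stable.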
This isn't the fastest way to do it, but it is fairly simple to explain:

ls -loghF |
awk '{ print $7 " " $0 }' |
sort -t. -k 1,1n -k 2,2n -k 3,3n -k 4,4n |
sed 's/^[^ ]* //'

The awk command copies the directory field to the front of the line. The sort command uses a single delimiter, '.' (I don't think you can use different delimiters for different parts of a line), and sorts the four numeric parts explicitly in numeric order. The sed command then removes the field that was added at the front.

This is a simple version of 'make it easy for sort to find the keys', because splitting the input is one of the expensive operations in sort.
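
Taking "make it easy for sort to find the keys" one step further, the temporary key can be built as a fixed-width, zero-padded string, so that a plain sort with no key options is enough. This is a sketch, not a tested production script: the name sort_by_padded_key is invented here, and it assumes that no version component has more than five digits.

```shell
# Hypothetical helper: sort `ls -l`-style lines on stdin by the version
# number in field 7, using a zero-padded temporary key.
# Assumption: each version component fits in five digits.
# Usage: ls -loghF | sort_by_padded_key
sort_by_padded_key() {
    awk '{
        # Split field 7 on dots; v[4] still carries the trailing "/",
        # and adding 0 reduces it to its leading number.
        split($7, v, ".")
        printf "%05d%05d%05d%05d %s\n", v[1], v[2], v[3], v[4] + 0, $0
    }' |
        LC_ALL=C sort |
        cut -d" " -f2-
}
```

Lines without a version in field 7, such as the "total" line, get an all-zero key and therefore sort first.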

Jonathan Leffler
FYI: Here's what I ended up doing.
Special thanks to: http://www.sysarch.com/Perl/sort_paper.html

$ ls -loghF | perl -e '
use strict;
my @in = <>;
# Keep only lines that end in a four-part version followed by "/".
my @out = grep(m|\d+\.\d+\.\d+\.\d+/$|, @in);
# Compare the four version components numerically, left to right.
print sort {
    my @aa = $a =~
            m|(\d+)\.(\d+)\.(\d+)\.(\d+)/$|;
    my @bb = $b =~
            m|(\d+)\.(\d+)\.(\d+)\.(\d+)/$|;
    $aa[0] <=> $bb[0] or
    $aa[1] <=> $bb[1] or
    $aa[2] <=> $bb[2] or
    $aa[3] <=> $bb[3]
    } @out;
'
drwxr-xr-x   2       3 Jun 18 11:33 12.0.40.0/
drwxr-xr-x   2       3 Jul 28 15:00 13.0.9.1/
drwxr-xr-x   2       3 Jul 13 14:12 13.0.11.4/
drwxr-xr-x   2       3 Jul 26 15:30 13.0.11.5/
drwxr-xr-x   2       4 Jul 27 11:33 13.0.11.6/
drwxr-xr-x   2       3 Aug  3 11:41 13.0.11.7/
drwxr-xr-x   2       3 Aug 10 11:53 13.0.11.8/
drwxr-xr-x   2       3 Aug 17 17:00 13.0.11.9/
drwxr-xr-x   2       3 Aug 24 14:45 13.0.11.10/
drwxr-xr-x   2       3 Aug 29 17:31 13.0.11.11/
drwxr-xr-x   2       3 Aug  3 14:37 13.0.17.0/
drwxr-xr-x   2       3 Aug 13 11:50 13.0.18.0/
drwxr-xr-x   2       3 Aug 17 11:21 13.0.19.0/
John DB
  • The earlier awk solution is better as a one-line command, but I use the solution above because I run it in the context of another Perl script, so Perl is already available. – John DB Aug 31 '12 at 07:09