
I have a file with multiple numbers separated by commas on each line:

1,13,2,5
1,3
2,3,24
1,13,2,6

This is just a small example file. In the real file, each line could have hundreds of numbers.

How can these be sorted numerically by field? The result should be:

1,3
1,13,2,5
1,13,2,6
2,3,24

I tried `sort -n -t,`, but that compares whole lines as numbers, producing the wrong result.

Jon
MrMartin
  • Tell sort which keys to use: https://stackoverflow.com/questions/357560/sorting-multiple-keys-with-unix-sort – Poshi Aug 02 '18 at 09:26
  • How can I do that when the list on each line can contain hundreds of numbers, and I want to sort by all of them? – MrMartin Aug 02 '18 at 09:58
  • Also, the number of keys varies by line – MrMartin Aug 02 '18 at 10:00
  • First, you asked to sort by field, so choose your field and sort by that. Second, if the number of keys varies by line, how can you sort on a key that does not exist on some lines? – Poshi Aug 02 '18 at 10:06
  • 1) I changed the title to reflect your comment, 2) sort with the assumption is that null – MrMartin Aug 02 '18 at 10:15

1 Answer


This is actually quite a subtle problem, to do with the way sort handles numeric fields. The upshot is that you need to explicitly tell sort to sort numerically on each key field:

sort -t, -k1,1n -k2,2n -k3,3n -k4,4n

If you don't do that, the info section for GNU sort says, slightly paraphrased,

sort would have used all characters beginning in the [first] field and extending to the end of the line as the primary numeric key. For the large majority of applications, treating keys spanning more than one field as numeric will not do what you expect.

which neatly summarises what you saw!
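To see the explicit keys at work, here is a minimal sketch that runs the command on the sample from the question. The wrapper name `sort_four_fields` is made up for illustration, and it only covers lines with at most four fields. A field that is missing on a short line is an empty key, which a numeric sort treats as zero, so `1,3` still lands ahead of `1,13,2,5`.

```shell
# Hypothetical wrapper: one numeric key per field, for up to four fields.
sort_four_fields() {
    sort -t, -k1,1n -k2,2n -k3,3n -k4,4n "$@"
}

# Sample data from the question, piped in on standard input.
printf '%s\n' '1,13,2,5' '1,3' '2,3,24' '1,13,2,6' | sort_four_fields
```

This should print the lines in the order asked for in the question: `1,3`, `1,13,2,5`, `1,13,2,6`, `2,3,24`.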

Obviously specifying keys explicitly is going to make sort inconvenient to use on files with arbitrarily long lists of numbers on each line. As a hack, you could try GNU sort with its version sort option, -V.

sort -V

which appears to do the right thing on your particular data. I've tested sort -V on lines with 600 numbers, and it works fine.
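If you would rather not depend on `-V`, one possible workaround is to generate the key list instead of typing it. The sketch below is only an illustration under some assumptions: the function name `sort_all_fields` is hypothetical, the input is a regular file (it is read twice), and the fields are plain integers. It counts the widest line with `awk` and then builds one `-kN,Nn` option per field.

```shell
# Hypothetical helper: sort comma-separated lines numerically on every field.
sort_all_fields() {
    file=$1
    # Largest number of comma-separated fields found on any line.
    max=$(awk -F, 'NF > max { max = NF } END { print max + 0 }' "$file")
    # Build one numeric key per field: -k1,1n -k2,2n ...
    keys=
    i=1
    while [ "$i" -le "$max" ]; do
        keys="$keys -k$i,${i}n"
        i=$((i + 1))
    done
    # $keys is left unquoted on purpose so it splits into separate options.
    sort -t, $keys "$file"
}
```

Calling it as `sort_all_fields numbers.txt` (the file name is a placeholder) sorts on as many fields as the longest line has. As with the explicit keys above, a missing field counts as zero, so a shorter line sorts before a longer one that shares its prefix whenever the extra numbers are positive.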

Mike Doe
Jon
  • So your answer only works if all fields are present, and I mention them explicitly in the sort parameters. I'm sorry if my question wasn't clear, but that's not what I'm asking – MrMartin Aug 02 '18 at 10:16
  • @MrMartin It turns out `sort -V` works on your data. I'd never heard of the `-V` option before now. Thanks for the great question! – Jon Aug 02 '18 at 11:26
  • Amazing answer! – MrMartin Aug 03 '18 at 13:09