I'm trying to sort lines made up of letters and numbers alphanumerically in an "intuitive"/natural way using the Unix sort
command, but cannot get it to sort properly. I have this file:
$ cat ~/headers
@42EBKAAXX090828:6:100:1699:328/2
@42EBKAAXX090828:6:10:1077:1883/2
@42EBKAAXX090828:6:102:785:808/2
I'd like to sort it alphanumerically, where intuitively @42EBKAAXX090828:6:10:... is first (since 10 is smaller than 100 and 102), second is @42EBKAAXX090828:6:100... and third is @42EBKAAXX090828:6:102:204:1871/2.
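To make the ordering I'm after concrete, here is a minimal Python sketch of the comparison I have in mind. It assumes that every run of digits should compare as an integer and everything else as plain text. The names natural_key and headers are only illustrative, and it sorts a small in-memory list, so it says nothing about how to handle a multi-gigabyte file.

```python
import re

def natural_key(line):
    """Split a line into text and integer chunks so digit runs compare numerically."""
    # re.split with a capturing group keeps the digit runs in the result:
    # even-indexed chunks are the text between numbers, odd-indexed chunks are digit runs.
    parts = re.split(r"(\d+)", line)
    return [int(p) if i % 2 else p for i, p in enumerate(parts)]

# The three sample lines from ~/headers (illustrative in-memory copy).
headers = [
    "@42EBKAAXX090828:6:100:1699:328/2",
    "@42EBKAAXX090828:6:10:1077:1883/2",
    "@42EBKAAXX090828:6:102:785:808/2",
]

for line in sorted(headers, key=natural_key):
    print(line)
```

With this key the :10: line comes before the :100: line, which comes before the :102: line, and that is the order I'd like to get for the whole file.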
I know that one could sort on a particular position within the line, but the position of the : here can vary, so that would not be a general, workable solution.
I tried:
sort --stable -k1,1 ~/headers > foo
with various combinations of the -n and -u parameters, but it does not give the correct ordering.
How can this be done efficiently, either from bash using sort or from Python? I'd like to apply this to files that are around 4-5 GB in size, containing millions of lines.
Thanks!