3

I have a list of names, which are out of order. How can I get them in the correct alphanumeric order, using a custom sort order for the alphabetical part?

My file numbers.txt:

alpha-1
beta-3
alpha-10
beta-5
alpha-5
beta-1
gamma-7
gamma-1
delta-10
delta-2

The main point is that my script should recognize that it should print alpha before beta, and beta before gamma, and gamma before delta.

That is, the words should be sorted based on the order of the letters in the Greek alphabet they represent.

Expected order:

alpha-1
alpha-5
alpha-10
beta-1
beta-3
beta-5
gamma-1
gamma-7
delta-2
delta-10

PS: I tried sort -n numbers.txt, but it doesn't do what I need.

mklement0
Daniel
  • 1
    Possible duplicate of [Sorting multiple keys with Unix sort](http://stackoverflow.com/questions/357560/sorting-multiple-keys-with-unix-sort) – Some programmer dude Jan 23 '17 at 10:52
  • 1
    As I said before, the main point is that there is a specific order to respect (alpha, beta, gamma, and then delta). I think that 'sort -k' wouldn't solve this issue, would it? – Daniel Jan 23 '17 at 11:16
  • 1
    I now see clearer what you want to do, and unfortunately it's not really possible to do with plain `sort`. It might be possible using `awk` but the best solution is probably to create a program (in e.g. Python) to do the sorting. – Some programmer dude Jan 23 '17 at 12:11
  • 2
    It's actually possible to do by decorating, sorting, and undecorating, i.e., adding a prefix based on the line contents (which can be done with sed, among others), sorting based on the prefix, then stripping the prefix (the same way you added it). – Hasturkun Jan 23 '17 at 13:19

6 Answers

2

You can use an auxiliary awk command as follows:

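awk -F- -v keysInOrder="alpha,beta,gamma,delta" '
    BEGIN {
        # split() returns the element count; length(array) is not guaranteed by older POSIX awks
        n = split(keysInOrder, a, ",")
        for (i = 1; i <= n; ++i) keysToOrdinal[a[i]] = i
    }
    # Prepend the ordinal of the key as a temporary first field
    { print keysToOrdinal[$1] "-" $0 }
' numbers.txt | sort -t- -k1,1n -k3,3n | cut -d- -f2-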
  • The awk command is used to:

    • map the custom keys onto numbers that reflect the desired sort order; note that the full list of keys must be passed, in order, via the variable keysInOrder.

    • prepend those numbers to the input as an auxiliary first column, also separated by -; e.g., beta-3 becomes 2-beta-3, because beta is in position 2 in the ordered list of sort keys.

  • sort then sorts awk's output numerically, first by the mapped number (field 1) and then by the original number, which is now field 3, yielding the desired custom sort order.

  • cut then removes the auxiliary mapped numbers again.
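
For readers who find the pipeline hard to follow, the same decorate, sort, undecorate steps can be sketched in Python. This is only an illustration under stated assumptions: the function name custom_sort and its parameters are made up for this example, every line is assumed to have the form key-number, and every key is assumed to appear in the ordered key list.

```python
def custom_sort(lines, keys_in_order, sep="-"):
    """Sort 'key-number' lines by a custom key order, then by number."""
    # Same mapping as the awk BEGIN block: key -> 1-based position.
    ordinal = {key: i for i, key in enumerate(keys_in_order, start=1)}

    # Decorate: "beta-3" becomes "2-beta-3" (raises KeyError on an unknown key).
    decorated = [f"{ordinal[line.split(sep)[0]]}{sep}{line}" for line in lines]

    # Sort numerically on field 1 and field 3, like sort -t- -k1,1n -k3,3n.
    def fields(entry):
        parts = entry.split(sep)
        return int(parts[0]), int(parts[2])

    decorated.sort(key=fields)

    # Undecorate: drop the first field, like cut -d- -f2-.
    return [entry.split(sep, 1)[1] for entry in decorated]


if __name__ == "__main__":
    sample = ["alpha-1", "beta-3", "alpha-10", "gamma-7", "delta-10", "delta-2"]
    for line in custom_sort(sample, ["alpha", "beta", "gamma", "delta"]):
        print(line)
```

In Python the decoration could be skipped entirely by sorting with a key function; the intermediate strings are kept here only to mirror what flows through the shell pipeline.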

mklement0
  • Please refer to my new question : http://stackoverflow.com/questions/41835799/sort-a-find-command-to-respect-a-custom-order-in-unix – Daniel Jan 24 '17 at 17:58
1

I would reach for Perl here. This script will work:

#!/usr/bin/env perl
use v5.14;          # turn on modern features

# Greek alphabet
my @greek_letters =qw(alpha beta     gamma   delta epsilon zeta
                      eta   theta    iota    kappa lambda  mu
                      nu    xi       omicron pi    rho     sigma
                      tau   upsilon  phi     chi   psi     omega);

# An inverted map from letter name to position number;
# $number{alpha} = 1, $number{beta} = 2, etc:
my %number;
@number{@greek_letters} = 1..@greek_letters;

# Read the lines to sort
chomp(my @lines = <>);

# split on hyphen into arrays of individual fields
my @rows = map { [ split /-/ ] } @lines;

# prepend the numeric position of each item's Greek letter
my @keyed = map { [ $number{$_->[0]}, @$_ ] } @rows;

# sort by Greek letter position (first field, index 0) and then
# by final number (third field, index 2)
my @sorted = sort {   $a->[0] <=> $b->[0]
                   || $a->[2] <=> $b->[2] } @keyed;

# remove the extra field we added
splice(@$_, 0, 1) for @sorted;

# combine the fields back into strings and print them out
say join('-', @$_) for @sorted;

Save the Perl code into a file (say, greeksort.pl) and run perl greeksort.pl numbers.txt to get your sorted output.

Mark Reed
1

Here's a Python solution. Don't try to do hard things with Bash, sed, awk. You can usually accomplish what you want, but it'll be more confusing, more error-prone, and harder to maintain.

#!/usr/bin/env python3

# Read input lines
use_stdin = True
if use_stdin:
    import sys
    lines = sys.stdin.read().strip().split()
else:
    # for testing
    with open('numbers.txt') as infile:  # avoid shadowing the built-in input()
        lines = infile.read().strip().split()

# Create a map from greek letters to integers for sorting
greek_letters = """alpha beta     gamma   delta epsilon zeta
                   eta   theta    iota    kappa lambda  mu
                   nu    xi       omicron pi    rho     sigma
                   tau   upsilon  phi     chi   psi     omega"""
gl = greek_letters.strip().split()
gl_map = {letter:rank for rank, letter in enumerate(gl)}

# Split each line into (letter, number)
a = (x.split('-') for x in lines)
b = ((s, int(n)) for s,n in a)

# Using an order-preserving sort, sort by number, then letter
by_number = lambda x: x[1]
by_greek_letter = lambda x: gl_map.get(x[0])
c = sorted(sorted(b, key=by_number), key=by_greek_letter)

# Re-assemble and print
for s,n in c:
    print('-'.join((s, str(n))))
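
One caveat with the script above: gl_map.get() returns None for a name that is not in the list, and Python 3 refuses to order None against integers, so the sort would raise a TypeError. A possible variation, sketched here with hypothetical names (GREEK, RANK, greek_key, greek_sorted), uses a single sort with a tuple key and places unknown names after all Greek letters. It assumes every line contains a hyphen followed by an integer.

```python
GREEK = """alpha beta gamma delta epsilon zeta eta theta iota kappa lambda mu
           nu xi omicron pi rho sigma tau upsilon phi chi psi omega""".split()
RANK = {name: position for position, name in enumerate(GREEK)}


def greek_key(line):
    """Return a sort key: (letter rank, name, number)."""
    name, _, number = line.partition("-")
    # Unknown names get a rank past the last letter; the name itself then
    # breaks ties so that unknown names sort alphabetically among themselves.
    return (RANK.get(name, len(GREEK)), name, int(number))


def greek_sorted(lines):
    """Sort lines by Greek letter order, then numerically by the suffix."""
    return sorted(lines, key=greek_key)


if __name__ == "__main__":
    print(greek_sorted(["delta-2", "foo-1", "gamma-7", "alpha-10", "alpha-5"]))
```

Whether unknown names should go last, go first, or be rejected is a design choice that the question does not settle; putting them last is just one option.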
Harvey
0

Generic solution: sort -t- -k 1,1 -k 2,2n numbers.txt

The script below handles the custom requirement by moving the delta lines after all the others. It is not the best solution. The result is written back to numbers.txt.

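#!/bin/bash

# Sort alphabetically by name, then numerically by number
sort -t- -k 1,1 -k 2,2n numbers.txt > new_test.txt
# Start from empty scratch files so a rerun does not append to stale data
: > temp_file
: > new_numbers.txt
while IFS= read -r i
do
    # Hold back the delta lines so they can be appended after gamma
    if [[ $i == delta-* ]]
    then
        echo "$i" >> temp_file
    else
        echo "$i" >> new_numbers.txt
    fi
done < new_test.txt
cat temp_file >> new_numbers.txt
cat new_numbers.txt > numbers.txt

rm -f new_test.txt temp_file new_numbers.txt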
Utsav
  • Thanks, but it still prints `delta` before `gamma` :/ – Daniel Jan 23 '17 at 12:03
  • 1
    Because `delta` comes before `gamma` alphabetically. If you're looking for something that recognizes Greek *letter names written in the Latin alphabet* and sorts them in Greek alphabetical order, I think you're likely to have to write something yourself. – Mark Reed Jan 23 '17 at 14:59
  • The updated answer now handles just one special case - this will be very cumbersome to generalize, and is also quite inefficient. – mklement0 Jan 23 '17 at 16:02
  • Provided both generic and custom solution and replaced for with while loop. – Utsav Jan 23 '17 at 16:03
  • A generic solution is one that takes an arbitrary list of sort keys in order, and sorts by them. Your solution hard-codes a _single_ exception to alphabetic sorting. Thanks for fixing the `for` loop problem. – mklement0 Jan 23 '17 at 16:06
0

If you have access to awk and sed, try this.

Updated to handle the Greek ordering:

cat test.txt | awk -F "-" '{ printf "%s-%0100i\n" , $1, $2 }' | \
sed 's/^alpha-\(.*\)$/01-\1/' | \
sed 's/^beta-\(.*\)$/02-\1/'  | \
sed 's/^gamma-\(.*\)$/03-\1/' | \
sed 's/^delta-\(.*\)$/04-\1/' | \
sort | \
sed 's/\(.*\)-\([0]*\)\(.*\)/\1-\3/' | \
sed 's/^01-\(.*\)$/alpha-\1/' | \
sed 's/^02-\(.*\)$/beta-\1/'  | \
sed 's/^03-\(.*\)$/gamma-\1/' | \
sed 's/^04-\(.*\)$/delta-\1/' 
london-deveoper
  • 1
    This works, but is quite inefficient due to the multiple `sed` commands; note that you can combine them all into a _single_ `sed` script, with the `s` calls separated by `;`. You don't need to `0`-pad the numbers in the input to achieve numerical sorting; instead, use `sort` with the `-n` option (field-selectively): `sort -t- -k 1,1n -k 2,2n`. If you didn't _replace_ the words before sorting with their numerical mapping, but _prepended_ the mapped numbers as a (temporary) 1st field, then all you'd need to do is to remove that field after sorting, using `cut -d- -f2-`; there would be no need for `sed` again. – mklement0 Jan 23 '17 at 17:10
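
To see why the answer pads the numbers and why the comment suggests numeric sorting instead, here is a small Python illustration. The sample values are chosen for this sketch only; zfill(3) plays the role of the awk printf padding (with a much smaller width), and key=int plays the role of sort -n.

```python
numbers = ["10", "5", "1"]

# Plain string comparison goes character by character, so "10" sorts before "5".
lexical = sorted(numbers)

# Zero-padding to a fixed width makes string order agree with numeric order.
padded = sorted(n.zfill(3) for n in numbers)

# Comparing as integers avoids padding (and later un-padding) altogether.
numeric = sorted(numbers, key=int)

print(lexical)  # ['1', '10', '5']
print(padded)   # ['001', '005', '010']
print(numeric)  # ['1', '5', '10']
```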
0

Don't try to do hard things with Bash, sed, awk

Yeah, use an actual shell and non-GNU userland commands. It's not much easier to code in the first place, but at least it won't be prone to random bugs introduced by idiotic maintainers who do not have a clue regarding backwards compatibility.

  • It looks like this should probably be a comment on the answer you're referencing, rather than an entirely new answer. – SamuelMS Jun 20 '18 at 13:32