0

Does anyone know how to sort rows into the order ["D9", "D10", "E9P", "E10P"]? I want to sort by the leading letters first, and then by the number that follows.

In [2]: rows
Out[2]: ['D10', 'D9', 'E9P', 'E10P']

In [3]: sorted(rows)
Out[3]: ['D10', 'D9', 'E10P', 'E9P']


1. I can sort 9 ahead of 10 like this:
In [9]: sorted(rows, key=lambda row: int(re.search(r'(\d+)', row).group(1)))
Out[9]: ['D9', 'E9P', 'D10', 'E10P']

2. This doesn't work for me:
In [10]: sorted(rows, key=lambda row: (row, int(re.search(r'(\d+)', row).group(1))))
Out[10]: ['D10', 'D9', 'E10P', 'E9P']
Jon
  • 215
  • 3
  • 9
  • 1
    [natsort](https://pypi.org/project/natsort/) is good for this. Here is an [SO question](https://stackoverflow.com/questions/4836710/does-python-have-a-built-in-function-for-string-natural-sort) about this – han solo Oct 04 '19 at 18:14

3 Answers

1

This handles any number of uppercase letters at the front, followed by any number of digits. Anything after the digits is ignored.

import re

def key(x):
    # Capture the leading uppercase letters and the digits that follow them.
    alpha, num_str = re.match(r'([A-Z]+)(\d+)', x).groups()
    num = int(num_str)
    return (alpha, num)

>>> sorted(["AC40", "AB55", "D9", "D10", "E9P", "E10P"], key=key)
['AB55', 'AC40', 'D9', 'D10', 'E9P', 'E10P']
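
Note that `re.match` returns `None` when a row does not start with uppercase letters followed by digits, so the key above would raise `AttributeError` on such input. Below is a minimal sketch of a more forgiving variant. It assumes that rows which do not fit the pattern should simply sort last, in plain string order; that fallback and the name `safe_key` are assumptions for illustration, not something the question specifies.

```python
import re

def safe_key(x):
    m = re.match(r'([A-Z]+)(\d+)', x)
    if m is None:
        # Assumption: rows that don't fit the pattern sort after all that do.
        return (1, x, 0)
    alpha, num_str = m.groups()
    # Leading 0 keeps well-formed rows ahead of the fallback group.
    return (0, alpha, int(num_str))

rows = ["E10P", "??", "D10", "E9P", "D9"]
print(sorted(rows, key=safe_key))
```

Both branches return tuples with the same element types at each position, so Python can always compare them.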
Evan
  • 2,120
  • 1
  • 15
  • 20
0

Extending what you already have, you could use row[0] instead of row as your primary sort key:

In [8]: sorted(rows, key=lambda row: (row[0], int(re.search(r'(\d+)', row).group(1))))
Out[8]: ['D9', 'D10', 'E9P', 'E10P']
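
The reason a `(row, number)` key has no effect is that Python compares tuples element by element and only looks at the second element when the first elements are equal. Every full `row` string is distinct, so the number is never consulted. A small illustration using values from the question (the variable names are just for the example):

```python
# Tuples compare lexicographically: the second element only breaks ties in the first.

# Full row as the first element: "D10" < "D9" as strings, so the numbers are never compared.
full_row_first = ("D10", 10) < ("D9", 9)

# Only the letter as the first element: the letters tie, so 10 < 9 decides.
letter_first = ("D", 10) < ("D", 9)

print(full_row_first)  # True
print(letter_first)    # False
```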
fuglede
  • 17,388
  • 2
  • 54
  • 99
  • This could fail to work if there are multiple letters before the first number, since this only looks at the first letter. – SyntaxVoid Oct 04 '19 at 18:19
  • Thanks a lot, fuglede. SyntaxVoid, thank you too. I will play around with this more. – Jon Oct 04 '19 at 18:21
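
To illustrate SyntaxVoid's point, here is a small sketch with two-letter prefixes. The rows `AC9` and `AB10` are made up for the example, as are the function names. Keying on `row[0]` treats both rows as prefix `A` and orders them by number alone, while capturing the whole letter run keeps `AB` ahead of `AC`.

```python
import re

rows = ["AC9", "AB10"]

def first_letter_key(row):
    # Only the first character is used as the primary key.
    return (row[0], int(re.search(r'\d+', row).group()))

def full_prefix_key(row):
    # The whole run of leading letters is used as the primary key.
    alpha, digits = re.match(r'([A-Za-z]+)(\d+)', row).groups()
    return (alpha, int(digits))

print(sorted(rows, key=first_letter_key))  # ['AC9', 'AB10']
print(sorted(rows, key=full_prefix_key))   # ['AB10', 'AC9']
```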
0

You could do:

lst = ["D9", "D10", "E9P", "E10P"]

def keys(val):
    first = val[0]
    number = int(''.join(filter(str.isdigit, val)))
    return first, number

result = sorted(lst, key=keys)
print(result)

Output

['D9', 'D10', 'E9P', 'E10P']

Or if you want to use regex:

import re

def keys(val):
    first = val[0]
    number = int(re.search(r'\d+', val).group())
    return first, number

Or also using regex:

def keys(val):
    alpha, digits = re.search(r'^([^\d]+)(\d+)', val).groups()
    return alpha, int(digits)

This last function has the advantage that it accommodates multiple non-digit characters at the beginning of the string.
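
For rows that contain more than one number, or that do not follow the letters-then-digits layout, a more general natural-sort key can be sketched with `re.split`. This is a different technique from the functions above, in the spirit of the natural sorting mentioned in the comments; it is a hand-rolled sketch rather than the natsort API, and the name `natural_key` is hypothetical.

```python
import re

def natural_key(val):
    # Splitting on a captured digit group yields alternating chunks:
    # even indices are text, odd indices are digit runs,
    # e.g. 'E10P' -> ['E', '10', 'P'].
    parts = re.split(r'(\d+)', val)
    # Convert only the digit runs to int so they compare numerically.
    return [int(p) if i % 2 else p for i, p in enumerate(parts)]

lst = ["E10P", "D10", "E9P", "D9"]
print(sorted(lst, key=natural_key))
```

Because text and numbers always sit at the same positions in every key, the lists can be compared without mixing `str` and `int`.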

Dani Mesejo
  • 61,499
  • 6
  • 49
  • 76