0

I want to find a way to sort strings that have numbers in them by their numerical size.

I found one way to sort strings that contain only numbers, which works well (Sorting numbers in string format with Python) but not when the string is a mix of words and numbers.

In this example I create the list in the order I want, but sorted() rearranges it:

>>> s = ['A_10x05', 'A_10x50', 'A_10x100']
>>> print(sorted(s))
['A_10x05', 'A_10x100', 'A_10x50']

Expected output:

['A_10x05', 'A_10x50', 'A_10x100']

A more complex example would be:

>>> s = ['Asset_Castle_Wall_25x400x100_Bottom_01', 'Asset_Castle_Wall_25x400x50_Top_02',
'Asset_Castle_Wall_25x400x10_Bottom_01',  'Asset_Castle_Wall_25x400x300_Top_01']
>>> print(sorted(s))
['Asset_Castle_Wall_25x400x100_Bottom_01', 'Asset_Castle_Wall_25x400x10_Bottom_01', 'Asset_Castle_Wall_25x400x300_Top_01', 'Asset_Castle_Wall_25x400x50_Top_02']

Expected output:

['Asset_Castle_Wall_25x400x10_Bottom_01', 'Asset_Castle_Wall_25x400x50_Top_02', 'Asset_Castle_Wall_25x400x100_Bottom_01',  'Asset_Castle_Wall_25x400x300_Top_01']

I think I would need to split each string at the numbers and sort part by part, sorting the number parts with the solution linked above. Then, where several strings start the same way, e.g. section[i] == 'A_', I would sort on section[i+1] and work my way to the end. This feels very complicated though, so maybe there is a better way.

Nimantha
optagon

3 Answers

1

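IIUC, you want to sort by the product of the numbers in 10x05, which you can do by passing a key function to sorted. This version assumes every string has the form prefix_NxM, with a single underscore and exactly two numbers: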

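def eval_result(s):
    # 'A_10x05' -> prefix 'A', op '10x05' (fails if there is more than one '_')
    prefix, op = s.split('_')
    # Convert both factors to int; int('05') == 5, so leading zeros are harmless
    num1, num2 = map(int, op.split('x'))
    return num1 * num2

sorted(s, key=eval_result)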

Output

['A_10x05', 'A_10x50', 'A_10x100']
Mortz
  • No, the numbers are measurements, and there might be more of them, like the size of an object in XY or XYZ, e.g. 100x50x50. Unity sorts filenames based on how big the number in the string is, and this is what I want to imitate. A filename might be "Gen_A_Wall_50x100x20_Base_01". I was just keeping my initial example as short as possible so it was easy to read. – optagon Aug 25 '22 at 08:55
  • Your question is not clear enough. It would help if you provide a larger / varied set of sample inputs and what the expected sort order is. – Mortz Aug 25 '22 at 09:54
  • 1
    I solved it by zero-padding all numbers to the length of the longest one, e.g. 50 becomes 050 if the longest number has 3 digits. Then I sort the original list using the proxy list of padded names, and Python gives them to me in the correct order. Posted the solution in the original post. – optagon Aug 25 '22 at 16:39
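The padding idea from the comment above could look roughly like the sketch below. The asker's actual code is not shown in this thread, so this is only one possible reading of it: the helper name pad_numbers is made up for illustration, and it assumes every run of digits in a name should be padded to the same width.

```python
import re

def pad_numbers(names):
    """Return copies of names with every digit run zero-padded to one common width."""
    # Width of the longest digit run found anywhere in the list (0 if there are none).
    width = max(
        (len(num) for name in names for num in re.findall(r'\d+', name)),
        default=0,
    )
    # Replace each digit run with its zero-padded form, e.g. '50' -> '050'.
    return [re.sub(r'\d+', lambda m: m.group().zfill(width), name) for name in names]

s = ['A_10x05', 'A_10x100', 'A_10x50']
padded = pad_numbers(s)

# Sort the original names by their padded proxies.
result = [name for _, name in sorted(zip(padded, s))]
print(result)
```

Because the padded strings all have digit runs of equal length, plain string comparison orders them the same way a numeric comparison would.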
0

Provided each string in the list contains exactly three dimensions, you can sort by their product:

import re
from functools import cache

s = ['Asset_Castle_Wall_25x400x100_Bottom_01', 'Asset_Castle_Wall_25x400x50_Top_02',
'Asset_Castle_Wall_25x400x10_Bottom_01',  'Asset_Castle_Wall_25x400x300_Top_01']

@cache
def get_size(s):
    if len(tokens := s.split('x')) != 3:
        return 0
    # Raw strings avoid the invalid-escape warning that '\d' triggers in newer Python versions
    first = re.findall(r'(\d+)', tokens[0])[-1]
    last = re.findall(r'(\d+)', tokens[-1])[0]
    return int(first) * int(tokens[1]) * int(last)

print(sorted(s, key=get_size))

Output:

['Asset_Castle_Wall_25x400x10_Bottom_01', 'Asset_Castle_Wall_25x400x50_Top_02', 'Asset_Castle_Wall_25x400x100_Bottom_01', 'Asset_Castle_Wall_25x400x300_Top_01']
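If the dimensions should be compared one by one rather than multiplied (the asker's comment says they are measurements, and that there may be two or three of them), a variant could return them as a tuple. This is a sketch under that assumption: dimension_key is a hypothetical name, and it only looks at the first run of x-separated numbers in each string.

```python
import re

def dimension_key(name):
    """Return the first run of 'x'-separated integers in name as a tuple of ints."""
    match = re.search(r'\d+(?:x\d+)+', name)
    if match is None:
        return ()  # names without a dimension block sort first
    return tuple(int(part) for part in match.group().split('x'))

s = ['Asset_Castle_Wall_25x400x100_Bottom_01', 'Asset_Castle_Wall_25x400x50_Top_02',
     'Asset_Castle_Wall_25x400x10_Bottom_01', 'Asset_Castle_Wall_25x400x300_Top_01']

print(sorted(s, key=dimension_key))
```

Tuples compare element by element, so this works for any number of dimensions, but it ignores the text before and after the dimension block.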
DarkKnight
0

I believe what you want is just to sort each part of the input strings separately - text parts alphabetically, numeric parts by numeric value, with no multiplications involved. If this is the case you will need a helper function:

from re import findall

s = ['A_10x5', 'Item_A_10x05x200_Base_01', 'A_10x100', 'B']

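def fun(s):
    # Split into runs of digits and runs of letters/underscores
    f = findall(r'\d+|[A-Za-z_]+', s)
    # Digit runs become ints so they compare by value; text runs stay strings
    return [int(x) if x.isdigit() else x for x in f]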

print(sorted(s, key=fun))

Output:

['A_10x5', 'A_10x100', 'B', 'Item_A_10x05x200_Base_01']
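One caveat with a key that mixes int and str values: if one name has a number where another has text at the same position (for example a name that starts with a digit), Python raises a TypeError when it compares them. A possible workaround, sketched below with a hypothetical natural_key helper, is to tag each part so that numbers and text are never compared directly. Here numbers are arbitrarily chosen to sort before text.

```python
import re

def natural_key(name):
    """Split name into digit and non-digit runs, tagging each run by type."""
    # With a capturing group, re.split returns the digit runs at odd indices.
    parts = re.split(r'(\d+)', name)
    # (0, int) for digit runs, (1, str) for text; empty text runs are skipped.
    return [(0, int(p)) if i % 2 else (1, p) for i, p in enumerate(parts) if p]

s = ['A_10x5', 'Item_A_10x05x200_Base_01', 'A_10x100', 'B', '2_B']

print(sorted(s, key=natural_key))
```

Unlike the findall pattern above, re.split keeps every character of the name, so hyphens, spaces, or dots still take part in the comparison.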
gimix