0

I need to sort a list of strings which contains digits at the beginning and end of the string, first by the beginning digits, then by the ending digits. So the beginning digits have priority over the ending digits.

For example:

    l = ['900abc5', '3000abc10', '1000abc5', '1000abc10', '900abc20']

Would become:

    l = ['900abc5', '900abc20','1000abc5','1000abc10','3000abc10']

I know that l.sort() will not work here because it sorts lexicographically. Every other method I tried seemed excessively complicated (for example: splitting the strings by matching beginning digits, then splitting again by ending digits, sorting, concatenating, and then recombining the list). Even summarizing that method shows that it is not efficient!

Edit: after playing around with the natsort module I found that natsorted(l) solves my particular issue.

Jomonsugi
  • Why did I get downvoted for this? What did I do wrong? I have three people who thought it was worth answering and it is not marked as a duplicate either. I assume this question will be helpful to many. – Jomonsugi Jan 24 '17 at 02:40
  • I didn't downvote you but you haven't shown your attempt. It's a better practice to try to solve a problem yourself and show your attempt than to just say I have a problem, give me a solution. – depperm Jan 24 '17 at 12:51
  • Understood. Thank you for the advice. I am new around here, but I am finding a vast disparity between questions asked a few years ago and those asked more recently. I could quickly link you to 100s of questions on this site where people just ask questions, with no example or attempt, and they are upvoted 500+. I am sure you have seen plenty yourself. I would call this a double standard, but I assume the community has just become more organized and stringent over the years. Unfortunately, this comes at the cost of being elitist and unhelpful to beginners. I spent hours on my problem before posting. – Jomonsugi Jan 25 '17 at 21:42

3 Answers

4

You can create a custom function that extracts the numbers from a string and pass it as the key to sorted().

For example, the function below uses a regex to extract the numbers:

import re

def get_nums(my_str):
    return list(map(int, re.findall(r'\d+', my_str)))

See "Python: Extract numbers from a string" for more alternatives.

Then call sorted() with get_nums as the key:

>>> l = ['900abc5', '3000abc10', '1000abc5', '1000abc10', '900abc20']

>>> sorted(l, key=get_nums)
['900abc5', '900abc20', '1000abc5', '1000abc10', '3000abc10']

Note: Based on your example, my regex assumes that numbers appear only at the start and the end of the string, and that all characters in between are non-numeric.
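
To illustrate the note above, here is a rough sketch of what happens when digits do show up in the middle. The strings in items are made up for the demonstration, and first_last_nums is a hypothetical helper (not part of the answer above) that keeps only the first and last runs of digits:

```python
import re

def get_nums(my_str):
    # Same approach as above: every run of digits becomes part of the key
    return list(map(int, re.findall(r'\d+', my_str)))

def first_last_nums(my_str):
    # Hypothetical helper: keep only the first and last runs of digits.
    # Assumes the string contains at least one digit.
    nums = re.findall(r'\d+', my_str)
    return int(nums[0]), int(nums[-1])

items = ['900a7b5', '900a1b20']  # made-up strings with digits in the middle

by_all = sorted(items, key=get_nums)           # middle digits affect the order
by_edges = sorted(items, key=first_last_nums)  # only the edge digits are compared

print(by_all)
print(by_edges)
```

With get_nums the middle number breaks the tie before the trailing one is ever looked at, so the two keys can disagree on such input.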

Moinuddin Quadri
1

Here is an option that uses regex to find the leading and trailing digits and uses them as the key in the sorted function:

import re
sorted(l, key=lambda x: (int(re.findall(r"^\d+", x)[0]), int(re.findall(r"\d+$", x)[0])))

# ['900abc5', '900abc20', '1000abc5', '1000abc10', '3000abc10']
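
One thing to watch for: the [0] indexing raises an IndexError if a string has no digits at one end. A possible variant, only as a sketch, captures both runs with a single compiled pattern and fails with a clearer message. EDGE_DIGITS and edge_key are names I made up here, and the pattern assumes at least one non-digit character between the two numbers:

```python
import re

# Hypothetical pattern: leading digits, a non-digit, any text, trailing digits
EDGE_DIGITS = re.compile(r'(\d+)\D.*?(\d+)')

def edge_key(s):
    # fullmatch forces the first group to start the string
    # and the second group to end it
    m = EDGE_DIGITS.fullmatch(s)
    if m is None:
        raise ValueError(f'expected digits at both ends: {s!r}')
    return int(m.group(1)), int(m.group(2))

l = ['900abc5', '3000abc10', '1000abc5', '1000abc10', '900abc20']
result = sorted(l, key=edge_key)
print(result)
```

Whether raising is the right behavior depends on your data; returning a fallback key would be another option.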
Psidom
0

Python's sorted function accepts a key parameter, which should be a function that transforms each list element into a sorting value. In your case, you want to sort by the numbers in the string. For example, for '900abc5' the key would be [900, 5], and so on. So you want to pass in a key function that transforms the string into a list of numbers.

Using regular expressions, it's quite easy to extract the digits from the string. All you need to do is convert the matched digit strings into actual integers, because regular expressions return their matches as strings.
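
To make the idea concrete, here is a small sketch that only prints the key each string would get and shows how Python compares such lists. The keys dict is just for illustration:

```python
import re

l = ['900abc5', '3000abc10', '1000abc5', '1000abc10', '900abc20']

# Build the sorting key for each string: a list of its numbers as ints
keys = {s: [int(n) for n in re.findall(r"\d+", s)] for s in l}
for s, key in keys.items():
    print(s, '->', key)

# Lists compare element by element, so the leading number is compared
# first and the trailing number only breaks ties
numeric_order_ok = [900, 5] < [900, 20] < [1000, 5]
print(numeric_order_ok)

# Plain string comparison goes character by character instead,
# which is why '900abc20' sorts before '900abc5' lexicographically
string_order = '900abc20' < '900abc5'
print(string_order)
```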

I believe the code below should work:

import re

l = ['900abc5', '3000abc10', '1000abc5', '1000abc10', '900abc20']

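def by_digits(e):
  # Extract each run of digits, e.g. '900abc5' -> ['900', '5']
  digits_as_string = re.findall(r"\d+", e)
  # Return a list: in Python 3, map() returns an iterator, which sorted() cannot compare
  return list(map(int, digits_as_string))

print(sorted(l, key=by_digits))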
phss