0

When I sort this set of strings, I don't get the order I'm after:

>>> test = { '3 Silver', '3 Oct', '4AD', '99 Reese', '1991', 'alpha', 'beta' }
>>> sorted(test)
['1991', '3 Oct', '3 Silver', '4AD', '99 Reese', 'alpha', 'beta']

This is not the order I want: 1991 is the largest of the entries that begin with a number, so it should come last among them, immediately before alpha.

Does anyone have any suggestions on how I could sort this the way I would like?

abcd
openCivilisation

2 Answers

1

If you want to sort the items by their leading numeric value first (there are edge cases to consider, but this should point you in the right direction):

from itertools import takewhile, dropwhile

test = ['3 Silver', '3 Oct', '4AD', '99 Reese', '1991', 'alpha', 'beta']

items = dict()
for word in test:
    ordlist  = []
    ## prenumber is 0 if the word has no leading digits
    prenumber = int(''.join(takewhile(lambda i: i.isdigit(), word)) or 0)
    ## Words with no leading number get infinity as their first item,
    ## which puts them at the end of the list when sorting.
    ## (A literal leading 0 is treated the same way.)
    ordlist.append(prenumber or float("inf"))
    ordlist.extend((ord(ch) for ch in dropwhile(lambda i: i.isdigit(), word)))
    items[word] = ordlist

### sort dictionary by value
s = sorted(zip(items.values(), items.keys()))
print(s)
## [([3, 32, 79, 99, 116], '3 Oct'),
##    ([3, 32, 83, 105, 108, 118, 101, 114], '3 Silver'),
##    ([4, 65, 68], '4AD'),
##    ([99, 32, 82, 101, 101, 115, 101], '99 Reese'),
##    ([1991], '1991'),
##    ([inf, 97, 108, 112, 104, 97], 'alpha'),
##    ([inf, 98, 101, 116, 97], 'beta')]

test_sorted = [e[1] for e in s]
## ['3 Oct', '3 Silver', '4AD', '99 Reese', '1991', 'alpha', 'beta']
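
The same ordering can also be sketched as a key function passed to sorted(). This skips the intermediate dictionary, so duplicate entries are kept. Treat it as an illustrative variant under assumptions rather than the code above: the helper name leading_number_key is made up for this example, the extra '4AD' is added only to show a duplicate, and a leading 0 is treated as a real number here.

```python
from itertools import takewhile


def leading_number_key(word):
    """Hypothetical helper: build a (leading number, remaining text) sort key."""
    # Collect the run of digits at the start of the word, if there is one.
    digits = ''.join(takewhile(str.isdigit, word))
    # Words without leading digits get infinity so they sort after all numbers.
    number = int(digits) if digits else float('inf')
    # The remaining text breaks ties between entries with the same number.
    return number, word[len(digits):]


# Sample data from the question, plus one duplicate to show it survives.
test = ['3 Silver', '3 Oct', '4AD', '99 Reese', '1991', 'alpha', 'beta', '4AD']
test_sorted = sorted(test, key=leading_number_key)
print(test_sorted)
## ['3 Oct', '3 Silver', '4AD', '4AD', '99 Reese', '1991', 'alpha', 'beta']
```

Because the key is computed per element instead of being stored under the word itself, the output always has as many items as the input.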
Dean
  • This works well, thanks! But it removes duplicate entries so the number of the items on output vs the input is not necessarily the same. – openCivilisation Apr 25 '15 at 07:26
0

Yes, you can do it, but you'll have to write your own "scoring" function that produces the order you want:

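import re

def score(token):
    # Numeric part: every digit in the token, joined into one integer.
    digits = re.sub(r'\D+', '', token)
    # Tokens without digits get infinity so they sort after the numbered ones
    # (Python 3 cannot compare an int with an empty string).
    n = int(digits) if digits else float('inf')
    # Text part: the token with digits and spaces removed.
    w = re.sub(r'[\d ]', '', token)
    # Return a tuple with the most important criterion first, the second next, etc.
    return n, w


arr = ['3 Silver', '3 Oct', '4AD', '99 Reese', '1991', 'alpha', 'beta']
print(sorted(arr, key=score))  # ['3 Oct', '3 Silver', '4AD', '99 Reese', '1991', 'alpha', 'beta']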
Nir Alfasi
  • Thanks for the suggestions, but what you've described isn't a catch-all rule; as far as I can see, it only handles numbers of up to 4 digits. – openCivilisation Apr 18 '15 at 05:44
  • @user1692999 I just learned this new trick: returning a list/tuple with the most important criterion first, the second next, etc. gives you exactly what you wanted in a very elegant way, without placing restrictions on the input. See the updated `score()` function above! – Nir Alfasi Apr 19 '15 at 05:25