6

My Python List of string is something like x but long enough:

x = ['aaa','ab','aa','c','a','b','ba']      

I wants to sort this list as: ['a', 'b', 'c', 'aa', 'ab', 'ba', 'aaa'] and I did as follows in two steps:

>>> x.sort()   
>>> x.sort(key=len)      
>>> x
['a', 'b', 'c', 'aa', 'ab', 'ba', 'aaa']   

But I need in one-step: I also tied using lambda function (taken help):

>>> x.sort(key=lambda item: (item, len(item)))
>>> x
['a', 'aa', 'aaa', 'ab', 'b', 'ba', 'c']  

But not as I desired:

Is it possible in one-step? Please me.

My Python:

~$ python --version  
Python 2.6.6
Community
  • 1
  • 1
Grijesh Chauhan
  • 57,103
  • 20
  • 141
  • 208

2 Answers2

9

You got the order of the tuple the wrong way round. When Python sorts on tuples, the first value is the main sort, with the second being the subsort, etc... - your code presumes the opposite order.

You want to sort by length, then alphabetically:

>>> x.sort(key=lambda item: (len(item), item))
>>> x
['a', 'b', 'c', 'aa', 'ab', 'ba', 'aaa']

Edit: As DSM points out in the comments, Python sorts letters as capitals first, then lowercase. If this behaviour isn't wanted, see this answer.

Community
  • 1
  • 1
Gareth Latty
  • 86,389
  • 17
  • 178
  • 183
  • This actually doesn't sort alphabetically due to case issues (but that's easily fixed.) [I admit the OP's list is all lowercase at the moment.] – DSM Dec 31 '12 at 16:06
  • @DSM I was emulating the behaviour of the original - supposedly - working code, it could be a bug that capitals will come first, but that could be the desired behaviour, it's unclear. – Gareth Latty Dec 31 '12 at 16:08
  • @Lattyware : My work done, And I understood. But A last doubt. How first 2-step solution works? I sorted by alphabetic then length!.Thats the reason i could not think of what you tried.. – Grijesh Chauhan Dec 31 '12 at 16:15
  • 1
    Your first solution works because when Python sorts on a key, and values are equal, it does a stable sort - that is, the order of equal values is the order they were in the input. This means that if you sort by one value, then by another, the first sort *falls through* as a subsort. – Gareth Latty Dec 31 '12 at 16:19
  • @Lattyware I read so many time in python book this term `stable sort` but I understood the meaning today.. Thanks Now completely got it. – Grijesh Chauhan Dec 31 '12 at 16:27
1

using itertools.grouby():

In [29]: lis = ['aaa','ab','aa','c','a','b','ba']
In [30]: list(chain(*[sorted(g) for k,g in groupby(sorted(lis,key=len),key=len)]))
Out[30]: ['a', 'b', 'c', 'aa', 'ab', 'ba', 'aaa']

timeit comparisons:

In [38]: x = ['aaa','ab','aa','c','a','b','ba']*1000

In [39]: random.shuffle(x)

#may be in more tricky test cases this would be fast

In [40]: %timeit sorted(x,key=lambda item: (len(item), item))
100 loops, best of 3: 11.3 ms per loop

In [41]: %timeit list(chain(*[sorted(g) for k,g in groupby(sorted(x,key=len),key=len)]))
100 loops, best of 3: 7.82 ms per loop
Ashwini Chaudhary
  • 244,495
  • 58
  • 464
  • 504
  • 2
    This seems massively overcomplicated. Not wrong, but... I don't really see the value of it. – Gareth Latty Dec 31 '12 at 16:19
  • 1
    I like `itertools` as much as the next guy, but this is whoa-crazy-crazy! PS: you don't need the `list` call, `sorted` will consume `g` and produce a list. – DSM Dec 31 '12 at 16:22
  • @DSM Forgot to remove the `list()` call, was using it for testing purpose. I know it's a weird way compared to lattyware's answer, but peformance wise this was faster. – Ashwini Chaudhary Dec 31 '12 at 16:28
  • 1
    @AshwiniChaudhary nice simulation..this aptitude is very good and required..thanks again! – Grijesh Chauhan Dec 31 '12 at 16:54