What is the best way to sort list with custom sorting parameters in Python?

Question

I have a series of lists that looks like this:

li1 = ['a.1', 'b.9', 'c.8', 'd.1', 'e.2']
li2 = ['a.4', 'b.1', 'c.2', 'd.2', 'e.4']

How can I rearrange the items in each list so that the first item is 'b.something'? For the example above:

li1 = ['b.9', 'a.1', 'c.8', 'd.1', 'e.2']
li2 = ['b.1', 'a.4', 'c.2', 'd.2', 'e.4']

Maintaining the order after the first item isn't important. Thanks for the help.

I'm not sure, but this question might be of interest to you: http://stackoverflow.com/questions/2436607/how-to-use-re-match-objects-in-a-list-comprehension — Chris Pratt, Nov 03 '11 at 14:57
just wondering why everybody used `s.startwith('b')`, instead of `s[0] == 'b'`. is there any performance advantage? if not i want to save my brain's long-term memory use. — yosukesabai, Nov 03 '11 at 15:01
@yosukesabai: Only one answer uses `s.startwith('b')`. The others use `s.startwith('b.')` like was asked for in the question. — Ignacio Vazquez-Abrams, Nov 03 '11 at 15:05
@IgnacioVazquez-Abrams: sorry, then, what i meant to say is `s.startwith('b.')` vs `s[:2] == 'b.'`. is the first superior to the second, and if so in which sense? — yosukesabai, Nov 03 '11 at 15:12
@yosukesabai: Well, for single elements, `startswith` doesn't crash on empty strings. For multiple characters, I don't see any advantage apart from reading nicer (well, you can specify start/stop and alternatives, but that's not needed here). — , Nov 03 '11 at 15:20
There is an advantage in `startswith`, which is not to take care about the number of characters to slice from the string (and therefore to avoid dumb index errors). — Joël, Nov 03 '11 at 15:40

score 5 · Answer 1 · answered Nov 03 '11 at 14:55

5

Python's sorting is stable, so you will maintain the order after the first item regardless.

li1.sort(key=lambda x: not x.startswith('b.'))

answered Nov 03 '11 at 14:55

Ignacio Vazquez-Abrams

776,304
153
1,341
1,358

1

This is very slick, but I imagine that some would argue that "explicit is better than implicit" means you shouldn't treat booleans as comparable values (`False < True`), and thus would expand to something like `0 if x.startswith('b.') else 1`. I'm not one of those people, though. :) – Karl Knechtel Nov 03 '11 at 17:52

score 4 · Accepted Answer · edited Jun 20 '20 at 09:12

4

rearrange the items in each list so that the first item is 'b.something'

Maintaining the order after the first item isn't important.

That isn't sorting, then. Conceptually, you're just trying to bring that element to the front.

In other words, you want a list that consists of that element, followed by everything that isn't that element. Fudging this a little for the case where there are multiple b.somethings, and noting that we don't care what happens as long as the first element is a b.something, we can rephrase that: a list of every element meeting the condition ("starts with b."), followed by every element not meeting the condition. (This is sometimes called partitioning; see for example std::partition in C++.)

In Python, this is as simple as describing those two list-components with list comprehensions and sticking them together:

[x for x in li if x.startswith('b.')] + [x for x in li if not x.startswith('b.')]

...Or you can pretend that you're sorting, just with a bunch of elements that really only have two values after the key is applied, and apply the appropriate key, as in Ignacio Vasquez-Abrams' answer.

edited Jun 20 '20 at 09:12

Community

1
1

answered Nov 03 '11 at 14:55

Karl Knechtel

62,466
11
102
153

That's the closest to the request. I don't know how fast it could be regarding to `sort` anyway, as you have to read the list twice with same function for filtering. – Joël Nov 03 '11 at 14:59
wow, this is cool. for some reason joining list comprehensions together never occurred to me. thanks. – drbunsen Nov 03 '11 at 15:22
@Joël reading the list twice only causes constant-factor overhead. Sorting is O(n lg n), since `sort` needs to be able to sort arbitrarily and thus must do some kind of comparison-based sort, while each filtering list-comprehension pass is trivially seen to be O(n). – Karl Knechtel Nov 03 '11 at 17:50
1

[Python sorting](http://en.wikipedia.org/wiki/Timsort) is O(n lg n) in the worst case; on "mostly sorted" data such as this (elements that result in the same key are already sorted since Python's sorting is stable), complexity approaches O(n). – Ignacio Vazquez-Abrams Nov 03 '11 at 18:48

Joël · Answer 3 · 2011-11-03T15:38:34.497

You can use sorted, that accepts a key argument, and returns a list:

>>> li1 = ['a.1', 'b.2', 'c.8']
>>> def k(s):
...     if s.startswith('b.'):
...         return 1
...     else:
...         return 2
...
>>> sorted(li1, key=k)
['b.2', 'a.1', 'c.8']

k shall return something that can be compared between iterable items.

Note: sort change the input in-place and returns nothing, when sorted returns the sorted list and does not modify your list. Both work the same way.

What is the best way to sort list with custom sorting parameters in Python?

3 Answers3