0

This question is based on previous Stackoverflow answer. The problem seems too big for me.

I now have sorting function sort_multi(lst, index_normal, index_reversed) which takes input list, and two parameters which indicates which column is sorted in normal order and which in reverse:

from pprint import pprint
from itertools import groupby, chain

l = ((1, 'b'),
     (1, 'd'),
     (2, 'a'),
     (1, 'a'))

def sort_multi(lst, index_normal, index_reversed):
    return list(chain.from_iterable([sorted(list(j), key=lambda v:v[index_reversed], reverse=True) for i, j in groupby(sorted(lst), key=lambda v:v[index_normal])]))

pprint(sort_multi(l, 0, 1), width=10)  # index 0 is sorted normally, 1 in reversed order.

Output:

[(1, 'd'),
 (1, 'b'),
 (1, 'a'),
 (2, 'a')]

What I want is extend the function to work for any number of indices. Eg.

For example input of 3 columns:

l = ((1, 'c', 'aa'),
     (2, 'b', 'bb'),
     (1, 'c', 'cc'),
     (1, 'd', 'dd'))

sort_multi(l, reversed=(False, True, True))

Gives output:

[(1, 'd', 'dd'),
 (1, 'c', 'cc'),
 (1, 'c', 'aa'),
 (2, 'b', 'bb')]

The tuples can contains strings, integers, ...

Andrej Kesely
  • 168,389
  • 15
  • 48
  • 91
  • But doesn't the [answer on the previous question](https://stackoverflow.com/a/51252873/953482) already do this? Can't you pass however many values you want to the `ascending` argument? – Kevin Jul 10 '18 at 12:38
  • I don' t want to use pandas, just builtin modules. I'm not sure if it's possible to do it efficiently (maybe with recursion?) – Andrej Kesely Jul 10 '18 at 12:44

2 Answers2

2

Recursive approach: sort the list by the first criteria. Then group by that criteria, and recursively sort each sub-group using the remaining criteria.

from operator import itemgetter
import itertools

def multisorted(seq, indices, reversed_states):
    if len(indices) == 0:
        return seq
    index = indices[0]
    reversed = reversed_states[0]
    partially_sorted_seq = sorted(seq, key = itemgetter(indices[0]), reverse=reversed_states[0])
    result = []
    for key, group in itertools.groupby(partially_sorted_seq, key=itemgetter(indices[0])):
        result.extend(multisorted(group, indices[1:], reversed_states[1:]))
    return result

d = ((1, 'c', 'aa'),
     (2, 'b', 'bb'),
     (1, 'c', 'cc'),
     (1, 'd', 'dd'))

print(multisorted(d, (0,1,2), (False, True, True)))

Result:

[(1, 'd', 'dd'), (1, 'c', 'cc'), (1, 'c', 'aa'), (2, 'b', 'bb')]

Approach #2. sort and sorted can already sort by multiple criteria out-of-the-box. We only need to change the key so that some of the elements of each tuple have reversed logic in regards to inequality comparison.

from functools import total_ordering

@total_ordering
class Negated:
    def __init__(self, value):
        self.value = value
    def __eq__(self, other):
        return self.value == other.value
    def __lt__(self, other):
        return self.value > other.value

def multisorted(seq, ordering):
    return sorted(seq, key = lambda t: [Negated(x) if reversed else x for x, reversed in zip(t, ordering)])

d = ((1, 'c', 'aa'),
     (2, 'b', 'bb'),
     (1, 'c', 'cc'),
     (1, 'd', 'dd'))

print(multisorted(d, (False, True, True)))
Kevin
  • 74,910
  • 12
  • 133
  • 166
2

You can do something way simpler like:

from operator import itemgetter

def sort_any_multi(lst, index_order_reversed):
    for index, reverse in reversed(index_order_reversed):
       lst = sorted(lst, key=itemgetter(index), reverse=reverse)
    return lst

r = ((1, 'c', 'aa'),
     (2, 'b', 'bb'),
     (1, 'c', 'cc'),
     (1, 'd', 'dd'))

print(sort_any_multi(r, [[0, False], [1, True], [2, True]]))

Gives off:

[(1, 'd', 'dd'), (1, 'c', 'cc'), (1, 'c', 'aa'), (2, 'b', 'bb')]

Trick is in sorting from the last priority index to the highest priority one.

Dusan Gligoric
  • 582
  • 3
  • 21
  • Yep, this is a perfectly serviceable approach. I suspect it's just a little inefficient since each sort goes over the entire list, where a recursive approach might call more sorts, but on smaller subsets. It's hard to formally prove that one has a better complexity than the other, though. – Kevin Jul 10 '18 at 14:15
  • 1
    @Kevin Ran 10 tests for 1 million 3 dimension integer array, code I pasted was 25% faster to my amazement. Your first example averaged around 4s, 2nd around 9s, mine did around 3s. But yes, estimations, estimations, I guess if not heavy production one should rely on what he can understand best. – Dusan Gligoric Jul 10 '18 at 14:32