2

I want to solve this problem:

Given an input string, sort the characters in decreasing order of frequency. If more than one characters had same frequency count, sort them in increasing lexicographical order. Example:

bdca --> abcd,
bdcda -> ddabc,
abba -> aabb,
bacbdc -> bbccad,

My solution involves creating the frequencies in a hash map, sorting the hash map dict items by frequency using sorted() and lambda function. Then for the items with the same frequency (I need to write a subroutine for this), I do another sorted with lambda function.

def string_sort(s):
    hmap = {}
    for char in s:
        if char not in hmap:
            hmap[char] = 1
        else:
            hmap[char] += 1
    freqs = sorted(hmap.items(), key=lambda x: x[1], reverse=True)
    num_occur: list = find_num_occur(freqs)
    sorted_freqs = []
    start_ind = 0
    for f in num_occur:
        tmp_freqs = sorted(freqs[start_ind : start_ind + f], key=lambda x: x[0])
        sorted_freqs.extend(tmp_freqs)
        start_ind = len(sorted_freqs)
    out = []
    for item in sorted_freqs:
        out.extend([item[0]] * item[1])
    return "".join(out)


def find_num_occur(freqs):
    count = 1
    out = []
    for i in range(len(freqs) - 1):
        if freqs[i][1] == freqs[i + 1][1]:
            count += 1
        else:
            out.append(count)
            count = 1
    out.append(count)
    return out

The solution is not elegant. I was told I can solve it easier if using comparators, but I don't know how to use comparators in python. Any suggestions? Or any other more elegant solution?

Thanks.

Stef
  • 13,242
  • 2
  • 17
  • 28
Martin
  • 35
  • 2
  • Does this answer your question? [Using a comparator function to sort](https://stackoverflow.com/questions/12749398/using-a-comparator-function-to-sort) – Serial Lazer Nov 09 '20 at 08:24
  • reimplement `<` you can use `__lt__` method – ZhaoYi Nov 09 '20 at 09:55
  • About vocabulary: "lexicographical order" doesn't make sense for a single character. You probably meant "alphabetical order". "Lexicographical order" is about comparing sequences. – Stef Nov 09 '20 at 12:01

2 Answers2

4

You don't need to use comparators, you can use a key-function just fine. There's a lot of unnecessary complexity in your solution, all you need is something to the effect of:

>>> from collections import Counter
>>> def transmogrify(s):
...     counts = Counter(s)
...     return ''.join(sorted(s, key=lambda c: (-counts[c], c)))
...
>>> transmogrify('bdca')
'abcd'
>>> transmogrify('bdcda')
'ddabc'
>>> transmogrify('abba')
'aabb'
>>> transmogrify('bacbdc')
'bbccad'

Note, a collections.Counter is just a dict that is specialized for counting, since it is a common enough pattern.

>>> Counter('bacbdc')
Counter({'b': 2, 'c': 2, 'a': 1, 'd': 1})
juanpa.arrivillaga
  • 88,713
  • 10
  • 131
  • 172
  • Alternatively you could use `Counter.most_common` to re-expand into the originals, since it's going to return the most common element first e.g.`''.join(l*n for l, n in Counter(s).most_common())` – Masklinn Nov 09 '20 at 08:47
  • No, you can't use just the count, consider `'ccbb'`. With what you have, it will simply return `'ccbb'` but it should return `'bbcc'` – juanpa.arrivillaga Nov 09 '20 at 08:48
  • Aye you're right I missed the lexical ordering fallback. – Masklinn Nov 09 '20 at 08:49
  • Thanks for the responses. @juanpa.arrivillaga I guess the problem is that I don't really understand the key function you provided here: `key=lambda c: (-counts[c], c)`. Can you please explain what it means to have these two arguments (-counts[c], c)? Thanks. – Martin Nov 09 '20 at 09:30
  • 1
    @Martin those aren't two arguments. That is a tuple, the first item in the tuple, contains the count of that character (well, negative, because you want to sort in decreasing order and by default, `sorted` sorts in increasing order), the second element contains the character. Tuples are sorted lexicographically. So first it compares the first element of the tuple, and if there is a tie, it comparees the second element (and so on, and so forth, for example `(1, 'b') < (1, 'c')` just like `'ab' < 'ac'`) – juanpa.arrivillaga Nov 09 '20 at 09:36
1

Counter & __lt__ Method

I think your task can be divided into 2 tasks:

  1. count task, this could be implemented easily by Counter class, but Counter cannot guarantee the order of its keys, so you need task 2
  2. sort task (comparators like CPP), you need a custom sort func, so you can create a new class and implement the __lt__ method, doc
from collections import Counter

class Item:
    def __init__(self, ch, times):
        self.ch = ch
        self.times = times
    def __lt__(self, other):
        if self.times > other.times:
            return True
        if self.times == other.times:
            return self.ch < other.ch
        return False
# my restruct 
def restruct(inp):
    c = Counter(inp)
    data = sorted([Item(ch, times) for ch, times in c.items()])
    return ''.join([item.ch*item.times for item in data])

Following restruct a very elegant implement (with the same ability) by @juanpa.arrivillaga, thank him.

def restruct(inp):
    c = Counter(inp)
    return ''.join(sorted(inp, key=lambda ch: Item(ch, c[ch])))

Then try restruct('bacbdc') get bbccad, bingo!!!

ZhaoYi
  • 211
  • 2
  • 8