I have the following list:

lst = ['AAAAAABB', 'AAAAAABA', 'AAAAAAAB', 'AAAAABAA', 'AAAAABAB', 'AAAAABBA']

And I'm getting this:

lst = ['AAAAAAAB', 'AAAAAABA', 'AAAAAABB', 'AAAAABAA', 'AAAAABAB', 'AAAAABBA']

But I want something like this:

lst = ['AAAAAAAB', 'AAAAAABA', 'AAAAABAA', 'AAAAAABB', 'AAAAABAB', 'AAAAABBA']

In other words, I want Python to sort my list by taking into account the sequence of A's and B's in each item.

Specifically, all the combinations with seven A's (a single B) should appear first, followed by the combinations with more than one B.

double-beep

2 Answers

Edit: I always forget this comment, but it is really important: do not give a variable a name such as list or dict. Those are the names of Python built-ins, and reusing them shadows the built-in.

Edit: you are sorting by the number of B's.

You can sort the list:

import functools

lst = ['AAAAAABB', 'AAAAAABA', 'AAAAAAAB', 'AAAAABAA', 'AAAAABAB', 'AAAAABBA']

def sort_by_b(a, b):
  # Compare by the number of B's first.
  ab = a.count('B')
  bb = b.count('B')
  if ab != bb:
    return -1 if ab < bb else 1
  # Same number of B's: fall back to plain string order.
  if a == b:
    return 0
  return -1 if a < b else 1

print(sorted(lst, key=functools.cmp_to_key(sort_by_b)))

Result:

['AAAAAAAB', 'AAAAAABA', 'AAAAABAA', 'AAAAAABB', 'AAAAABAB', 'AAAAABBA']
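
The same ordering can probably be expressed without a comparison function. This is a minimal sketch, assuming the goal is exactly "fewest B's first, ties broken alphabetically"; the helper name `by_b_then_alpha` is only illustrative and not part of the code above:

```python
lst = ['AAAAAABB', 'AAAAAABA', 'AAAAAAAB', 'AAAAABAA', 'AAAAABAB', 'AAAAABBA']

def by_b_then_alpha(s):
    # Hypothetical helper: tuples compare element by element,
    # so the B count decides first and the string itself breaks ties.
    return (s.count('B'), s)

result = sorted(lst, key=by_b_then_alpha)
print(result)
```

A key function is called once per item, whereas a comparison function is called for every pair that gets compared, so this form is usually the simpler choice when the ordering can be written as a tuple.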
developer_hatch

I think I get what you are trying to do; correct me if I am wrong. It basically depends on B, doesn't it? The more often B appears, or the closer a B is to the start of the string, the later the string appears in the final list.

So, here is what I do.

  • First, I sort the list by the number of B's in each string.
  • This is not sufficient, because strings with the same number of B's keep the order they had in the original list. To see this, remove the for loop from my answer and print lst; you will clearly see the issue.
  • So I then need to sort (a normal lexicographic sort) each group of strings that have the same number of B's. I used groupby for this: I sorted each group and appended it to the final answer.

    from itertools import groupby
    
    lst = ['AAAAAABB', 'AAAAAABA', 'AAAAAAAB', 'AAAAABAA', 'AAAAABAB', 'AAAAABBA']
    ans = []
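    # Sort by the number of B's so that equal counts are adjacent for groupby.
    lst.sort(key=lambda x: x.count('B'))
    for _, group in groupby(lst, lambda x: x.count('B')):
        # Sort each group of equal B count lexicographically.
        ans.extend(sorted(group))

    print(ans)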

Output:

['AAAAAAAB', 'AAAAAABA', 'AAAAABAA', 'AAAAAABB', 'AAAAABAB', 'AAAAABBA']
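
Since Python's sort is stable, the grouping step could likely be skipped: sorting lexicographically first and then by the number of B's keeps the lexicographic order within each count. A sketch under that assumption, with `ordered` as an illustrative name:

```python
lst = ['AAAAAABB', 'AAAAAABA', 'AAAAAAAB', 'AAAAABAA', 'AAAAABAB', 'AAAAABBA']

# Pass 1: plain lexicographic order.
ordered = sorted(lst)
# Pass 2: stable sort by B count; items with equal counts keep their pass 1 order.
ordered.sort(key=lambda s: s.count('B'))

print(ordered)
```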
Miraj50