python: grouping based on frequency of occurrence

Question

I have labels and their frequencies (i.e. the number of times each is repeated) for a dataset.

Is there a library that can group together labels with nearly the same frequency (i.e. based on how much the frequencies vary)?

As an example: suppose a is repeated 10 times, b 9 times, c 6 times, d 5 times, and e 2 times. I want a and b to fall into one group, c and d into another group, and e into a third group.

Please provide exact rules that you need the grouping to be based on — kosnik, Nov 04 '19 at 16:08

score 0 · Answer 1 · answered Nov 04 '19 at 17:41


You can use the following function to group labels that share exactly the same count.

def group_labels(cnts):
    """Group labels that share exactly the same count."""
    d = {}
    for k, v in cnts.items():
        # Collect every label under its count.
        d.setdefault(v, []).append(k)
    return sorted(d.values(), key=lambda x: x[0])  # sorted by first label

Example

cnts = {'a': 4, 'b': 15, 'c': 4, 'd': 16, 'e': 1, 'f': 16}
print(group_labels(cnts))
[['a', 'c'], ['b'], ['d', 'f'], ['e']]


DarrylG


Thanks for the input. What I actually need is to group elements whose frequencies fall within a range, or whose difference is within a given limit. – Mandroid Nov 05 '19 at 02:56

@Mandroid--is your goal to cluster labels based upon a value [similar to this problem](https://stackoverflow.com/questions/18364026/clustering-values-by-their-proximity-in-python-machine-learning)? – DarrylG Nov 05 '19 at 04:12
Exactly. Thanks a lot. – Mandroid Nov 05 '19 at 04:33
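
For the proximity-based grouping confirmed in the comments above, here is a minimal sketch in plain Python rather than a library solution. The names `group_by_gap` and `max_gap` are hypothetical, and the rule is an assumption about what "within a given limit" means: labels are sorted by count, and a new group starts whenever the gap to the previous count exceeds `max_gap`.

```python
def group_by_gap(cnts, max_gap):
    """Group labels whose counts are close to each other.

    Assumption: a label joins the current group when its count is within
    max_gap of the previous (next-higher) count in sorted order.
    """
    # Sort by descending count; break ties alphabetically for stable output.
    items = sorted(cnts.items(), key=lambda kv: (-kv[1], kv[0]))
    groups = []
    prev = None
    for label, count in items:
        if prev is not None and prev - count <= max_gap:
            groups[-1].append(label)  # close enough to the previous count
        else:
            groups.append([label])    # gap too large, start a new group
        prev = count
    return groups


cnts = {'a': 10, 'b': 9, 'c': 6, 'd': 5, 'e': 2}
print(group_by_gap(cnts, max_gap=1))
# [['a', 'b'], ['c', 'd'], ['e']]
```

Because each count is compared only with its neighbor, a long chain such as 10, 9, 8, 7 ends up in a single group even though its ends differ by more than `max_gap`. If that is not what you want, compare against the first count of the current group instead.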
