2

I have the following dictionary:

d = {"a":["MRS","VAL"],"b":"PRS","c":"MRS","d":"NTS"}

I would like to create a dictionary that gives the number of occurrences of each value. Basically, it would look like:

output = {"MRS":2,"PRS":1,"NTS":1,"VAL":1}

Does anyone know how I could do that? Thanks in advance!

Jb_Eyd
    The structure of your dictionary is weird. Why are the values not always in lists? This makes it more difficult to handle. `d = {"a":["MRS","VAL"], "b":["PRS"], "c":["MRS"], "d":["NTS"]}` would be preferable. – Tim Pietzcker Nov 24 '15 at 18:09

5 Answers

8

Since your dict contains both strings and lists of strings, you first need to flatten its values into a single sequence of strings:

import collections
d = {"a":["MRS","VAL"],"b":"PRS","c":"MRS","d":"NTS"}

def flatten(l):
    for el in l:
        if isinstance(el, collections.Iterable) and not isinstance(el, basestring):
            for sub in flatten(el):
                yield sub
        else:
            yield el

>>> list(flatten(d.values()))
['MRS', 'VAL', 'MRS', 'PRS', 'NTS']

You can then use a Counter to count the occurrences of each string:

>>> collections.Counter(flatten(d.values())) 
Counter({'MRS': 2, 'NTS': 1, 'PRS': 1, 'VAL': 1})
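The snippet above relies on Python 2 names (`basestring`, `collections.Iterable`). A minimal sketch of the same idea for Python 3 follows, assuming `str` and `bytes` are the only string-like types that need protecting from iteration:

```python
from collections import Counter
from collections.abc import Iterable


def flatten(items):
    """Yield the leaves of arbitrarily nested iterables, keeping strings whole."""
    for el in items:
        # Strings are iterable too, so exclude them explicitly.
        if isinstance(el, Iterable) and not isinstance(el, (str, bytes)):
            yield from flatten(el)
        else:
            yield el


d = {"a": ["MRS", "VAL"], "b": "PRS", "c": "MRS", "d": "NTS"}
counts = Counter(flatten(d.values()))
print(dict(counts))
```

Because `flatten` recurses, this should also cope with lists nested inside lists.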
dawg
    [Link to flatten recipe using hasattr.](http://code.activestate.com/recipes/577255-flatten-a-list-or-list-of-lists-etc/) Are there any advantages to checking whether it's a list or tuple instead of just iterable? The most common thing that your version would exclude would be `set`, I guess. – Cody Piersall Nov 24 '15 at 18:15
4

As already posted, collections.Counter is the obvious approach. Alternatively, you can use itertools.groupby on its own, or a combination of itertools.groupby and collections.Counter.

  1. Just itertools.groupby

    >>> from itertools import groupby
    >>> a, b = [list(g) for _,  g in groupby(d.values(), type)]
    >>> {k: len(list(g)) for k, g in groupby(sorted(a[0] + b))}
    {'NTS': 1, 'VAL': 1, 'PRS': 1, 'MRS': 2}
    
  2. itertools.groupby and collections.Counter

    >>> from collections import Counter
    >>> from itertools import groupby
    >>> a, b = [list(g) for _,  g in groupby(d.values(), type)]
    >>> dict(Counter(a[0] + b))
    {'NTS': 1, 'VAL': 1, 'PRS': 1, 'MRS': 2}
    

This does the job for the OP's exact data, but it is not robust: it assumes the values form exactly two consecutive runs by type, with a single list coming first.
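One way to keep `groupby` while dropping the assumption about how the values are ordered is to normalise every value to a list first. This is a sketch rather than the method shown above; it assumes Python 3, and the helper name `as_list` is made up for illustration:

```python
from itertools import chain, groupby


def as_list(value):
    """Hypothetical helper: wrap a bare string in a list, copy anything else."""
    return [value] if isinstance(value, str) else list(value)


d = {"a": ["MRS", "VAL"], "b": "PRS", "c": "MRS", "d": "NTS"}

# Flatten one level, then sort so that equal strings sit next to each other,
# which is what groupby needs in order to form one group per distinct value.
flat = sorted(chain.from_iterable(as_list(v) for v in d.values()))
output = {key: sum(1 for _ in group) for key, group in groupby(flat)}
print(output)
```

The sort makes this O(n log n), so a plain Counter is still the simpler choice when `groupby` is not otherwise needed.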

Abhijit
1

In general, you can use a Counter to map keys to counts - it's essentially a multiset.
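To illustrate the multiset comparison, here is a small sketch of how a Counter accumulates and combines counts. The sample values are borrowed from the question; the variable names are only for illustration:

```python
from collections import Counter

counts = Counter(["MRS", "VAL"])   # count an initial batch of items
counts.update(["MRS"])             # update() adds to the existing counts
counts["PRS"] += 1                 # an unseen key is treated as 0 before the increment

other = Counter({"NTS": 1, "MRS": 1})
combined = counts + other          # multiset sum: counts are added per key
print(combined)
```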

Since your dict's values are a mix of strings and lists, you'll have to do a little transforming first, but if you simply iterate over every value and sub-value in your dict and add each one to a Counter instance, you'll get what you want.

Here's a first-pass implementation; depending on exactly what d will contain you may have to tweak it a bit:

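import types
from collections import Counter, Iterable  # Python 2; on Python 3 use collections.abc.Iterable and str

counts = Counter()
for elem in d.values():
  # Treat any non-string iterable as a collection of values to count.
  if isinstance(elem, Iterable) and not isinstance(elem, types.StringTypes):
    for sub_elem in elem:
      counts[sub_elem] += 1
  else:
    counts[elem] += 1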

Notice that we check if elem is an iterable and not a string. Python doesn't make distinguishing between strings and collections easy, so if you know d will contain only strings and lists (for instance) you can simply do isinstance(elem, list) and so on. If you can't guarantee the values of d will all be lists (or tuples, or so on) it's better to explicitly exclude strings.

Also, if d could contain nested values (e.g. a list containing lists of strings) this won't be sufficient; you'll likely want to write a recursive function to flatten everything, like dawg's solution.

dimo414
1

I am lazy, so I am going to use library functions to get the job done for me:

import itertools
import collections

d = {"a": ["MRS", "VAL"], "b": "PRS", "c": "MRS", "d": "NTS"}
values = [[x] if isinstance(x, basestring) else x for x in d.values()]
counter = collections.Counter(itertools.chain.from_iterable(values))
print counter
print counter['MRS']  # Sampling

Output:

Counter({'MRS': 2, 'NTS': 1, 'PRS': 1, 'VAL': 1})
2

At the end, counter acts like the dictionary you want.
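As a rough illustration of what "acts like the dictionary" means in practice, the sketch below builds a Counter from the flattened values written out literally and pokes at it. `"XYZ"` is just a made-up key that is not in the data:

```python
from collections import Counter

counter = Counter(["MRS", "VAL", "MRS", "PRS", "NTS"])

output = dict(counter)         # convert to a plain dict if a Counter is not wanted
missing = counter["XYZ"]       # absent keys read as 0 instead of raising KeyError
top = counter.most_common(1)   # list of (value, count) pairs, most frequent first
print(output, missing, top)
```

Reading an absent key this way does not insert it, so the Counter still holds only the values that were actually counted.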

Explanation

Consider this line:

values = [[x] if isinstance(x, basestring) else x for x in d.values()]

Here, I turned every value in the dictionary d into a list to make processing easier. values might look something like the following (order might be different, which is fine):

# values = [['MRS', 'VAL'], ['MRS'], ['PRS'], ['NTS']]

Next, the expression:

itertools.chain.from_iterable(values)

returns an iterator that flattens the list of lists. Conceptually, the result looks like this:

['MRS', 'VAL', 'MRS', 'PRS', 'NTS']

Finally, the Counter class consumes that sequence and counts each item, which gives us the final result.

Hai Vu
0

You can do it with just built-in functions, this way:

>>> d = {"a":["MRS","VAL"],"b":"PRS","c":"MRS","d":"NTS"}
>>> 
>>> flat = []
>>> for elem in d.values():
    if isinstance(elem, list):
        for sub_elem in elem:
            flat.append(sub_elem)
    else:
        flat.append(elem)


>>> flat
['MRS', 'VAL', 'MRS', 'PRS', 'NTS']
>>> 
>>> output = {}
>>> 
>>> for item in flat:
    output[item] = flat.count(item)
>>>
>>> output
{'NTS': 1, 'PRS': 1, 'VAL': 1, 'MRS': 2}
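Note that `flat.count(item)` rescans the whole list for every item, so the loop above does quadratic work on long lists. A minimal single-pass sketch that still uses only built-ins is shown here, with the flattened list written out literally so the snippet stands alone:

```python
flat = ["MRS", "VAL", "MRS", "PRS", "NTS"]

output = {}
for item in flat:
    # dict.get supplies 0 the first time an item is seen.
    output[item] = output.get(item, 0) + 1

print(output)
```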
Iron Fist