1

I have a Counter object like

Counter({'the': 10, 'to': 10, 'of': 5, 'independence': 5, 'puigdemont': 5, 'mr': 5, 'a': 4, 'spain': 4, 'for': 4})

I want to reassign each element's value in increasing order by existing value, like

Counter({'the': 0, 'to': 1, 'of': 2, 'independence': 3, 'puigdemont': 4, 'mr': 5, 'a': 6, 'spain': 7, 'for': 8})

Is there a way to do this?

Thanks in advance.

Update:

(My English is not very good, so feel free to skip my explanation and scroll down to the example below.) Sorry, it seems I didn't make my question clear. The actual Counter object is much longer. It is built from a paragraph, and the value of each word is its number of occurrences in that paragraph. I want to build a dictionary so that I can replace the words in my paragraph with the corresponding values in the dictionary. The values should be assigned in order of decreasing word frequency, and if two words have the same number of occurrences, in alphabetical order.

Example:

string = "where there is smoke there is fire"

Occurrences for each word in string: where=1, there=2, is=2, smoke=1, fire=1. So I need a dictionary like:

{"is": 0, "there": 1, "fire": 2, "smoke": 3, "where": 4}

The most frequent words are "is" and "there", but in alphabetical order, "i" is in front of "t", so "is" is 0 and "there" is 1.

Is there any good method to accomplish this?

Very thankful!!

nimsbnims
  • 2
    What's stopping you? – jonatan Nov 10 '17 at 12:03
  • 3
    A `Counter` is not ordered, so your suggestion makes little sense, you could make an ordered counter though https://stackoverflow.com/questions/35446015/creating-an-ordered-counter – Chris_Rands Nov 10 '17 at 12:04
  • 1
    How are you going to use the reassigned counter later? – Alexey Nov 10 '17 at 12:21
  • @jonatan Uh, I'm new to python, I've googled my problem and got nothing useful... – nimsbnims Nov 10 '17 at 13:31
  • @Chris_Rands Thanks, I'll check it. – nimsbnims Nov 10 '17 at 13:31
  • @Alexey I'm gonna use it to replace the words in a string with the values. – nimsbnims Nov 10 '17 at 13:33
  • @nimsbnims updating values in a dict is very straight-forward so it looks like you have more requirements than you're telling us. That, or you're asking how to update a dictionary value. So it's really difficult to figure out what you want, it becomes a guessing game. It's not a well-defined question. Is all you want to assign a unique value to each key? That also seems very straight-forward, what did you try? – jonatan Nov 10 '17 at 13:56
  • @jonatan Yes, I've updated my question; I hope I'm not confusing you again. – nimsbnims Nov 10 '17 at 14:04

4 Answers

0

You would need an OrderedDict:

from collections import Counter, OrderedDict

data_dict = OrderedDict({'the': 10, 'to': 10, 'of': 5, 'independence': 5, 'puigdemont': 5, 'mr': 5, 'a': 4, 'spain': 4, 'for': 4})
c1 = Counter(dict(zip(data_dict.keys(), range(len(data_dict)))))
print(c1)

Output:

Counter({'for': 8, 'spain': 7, 'a': 6, 'mr': 5, 'puigdemont': 4, 'independence': 3, 'of': 2, 'to': 1, 'the': 0})

Netwave
  • Thank you for answering. However, it doesn't work for me... I've updated my question with a more detailed description. – nimsbnims Nov 10 '17 at 14:03
0

Access each key and change its value:

from collections import Counter

a_dict = Counter({'the': 10, 'to': 10, 'of': 5, 'independence': 5, 'puigdemont': 5, 'mr': 5, 'a': 4, 'spain': 4, 'for': 4})

n = 0
for d in a_dict:    
    a_dict[d] = n
    n += 1

>>> a_dict
Counter({'for': 8, 'spain': 7, 'a': 6, 'mr': 5, 'puigdemont': 4, 'independence': 3, 'of': 2, 'to': 1, 'the': 0})

If you could go with an ordered list of tuples:

>>> sorted(a_dict.items(), key=lambda x: x[1])
[('the', 0), ('to', 1), ('of', 2), ('independence', 3), ('puigdemont', 4), ('mr', 5), ('a', 6), ('spain', 7), ('for', 8)]
srikavineehari
  • Thank you for answering. However, it doesn't work for me... I've updated my question with a more detailed description. – nimsbnims Nov 10 '17 at 14:03
0

As I understand from your comment, you don't need a sorted counter, so:

from collections import Counter

c = Counter({'the': 10, 'to': 10, 'of': 5, 'independence': 5, 'puigdemont': 5, 'mr': 5, 'a': 4, 'spain': 4, 'for': 4})

for i, k in enumerate(c.most_common()):
    c[k[0]] = i

Result:

Counter({'spain': 8, 'for': 7, 'a': 6, 'puigdemont': 5, 'independence': 4, 'mr': 3, 'of': 2, 'the': 1, 'to': 0})

Update:

m = c.most_common()
res = {k[0]: i for i, k in enumerate(sorted(m, key=lambda x: (-x[1], x[0])))}

Result:

{'a': 6, 'spain': 8, 'of': 4, 'mr': 3, 'the': 0, 'for': 7, 'to': 1, 'independence': 2, 'puigdemont': 5}
Alexey
  • Thank you for answering. However, it doesn't work for me... I've updated my question with a more detailed description. – nimsbnims Nov 10 '17 at 14:03
0

To sort your words by frequency and then in alphabetical order, and then build a dictionary that assigns a unique ID to each word:

from collections import Counter

c = Counter({'the': 10, 'to': 10, 'of': 5, 'independence': 5, 'puigdemont': 5, 'mr': 5, 'a': 4, 'spain': 4, 'for': 4})
res = {word: unique_id for unique_id, (_, word) in enumerate(
    sorted([(-freq, word) for word, freq in c.most_common()]))
}

print(res)

output:

{'the': 0, 'to': 1, 'independence': 2, 'mr': 3, 'of': 4, 'puigdemont': 5, 'a': 6, 'for': 7, 'spain': 8}

Note that the result is a plain dict. In CPython 3.6 it comes out in insertion order as an implementation detail; from Python 3.7 onward insertion order is guaranteed by the language, but on older versions it should not be relied on.

The innermost comprehension creates (-freq, word) tuples, which produce the desired sort order. The outer comprehension discards the frequency (it unpacks each tuple and keeps only the word) and uses enumerate to generate the unique IDs.
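To see why the (-freq, word) tuples sort the way the question asks, here is a small sketch using the example sentence from the question. It assumes words are separated by plain whitespace; the names `text`, `pairs`, and `word_ids` are only illustrative.

```python
from collections import Counter

# Example sentence taken from the question.
text = "where there is smoke there is fire"
counts = Counter(text.split())

# Tuples compare element by element: the negated count comes first,
# so higher counts sort earlier, and ties fall back to the word itself.
pairs = sorted((-freq, word) for word, freq in counts.items())
print(pairs)
# [(-2, 'is'), (-2, 'there'), (-1, 'fire'), (-1, 'smoke'), (-1, 'where')]

# Number the words in that order.
word_ids = {word: i for i, (_, word) in enumerate(pairs)}
print(word_ids)
# {'is': 0, 'there': 1, 'fire': 2, 'smoke': 3, 'where': 4}
```

The printed dictionary order shown in the comment assumes Python 3.7 or later; the word-to-ID mapping itself is the same on any version.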

Edit: if order is desired in the output, instead use:

from collections import Counter, OrderedDict

c = Counter({'the': 10, 'to': 10, 'of': 5, 'independence': 5, 'puigdemont': 5, 'mr': 5, 'a': 4, 'spain': 4, 'for': 4})
res = OrderedDict((word, unique_id) for unique_id, (_, word) in enumerate(
    sorted([(-freq, word) for word, freq in c.most_common()]))
)

print(res)
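
The question mentions that the dictionary will be used to replace the words in the paragraph with their values. A minimal sketch of that last step might look like the following. `encode_words` is a hypothetical helper, not part of `collections`, and it assumes whitespace-separated words with no punctuation or case normalization.

```python
from collections import Counter

def encode_words(text):
    """Hypothetical helper: rank words by (-count, word) and encode the text."""
    words = text.split()
    counts = Counter(words)
    # Most frequent first; ties broken alphabetically.
    ranked = sorted(counts, key=lambda w: (-counts[w], w))
    word_ids = {word: i for i, word in enumerate(ranked)}
    # Replace every word in the original order with its ID.
    return [word_ids[w] for w in words], word_ids

encoded, word_ids = encode_words("where there is smoke there is fire")
print(encoded)  # [4, 1, 0, 3, 1, 0, 2]
```

Returning the mapping together with the encoded list keeps it possible to decode the IDs back into words later.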
jonatan