Relative Frequency in Python

Question

Is it possible to calculate the relative frequency of elements occurring in a list in Python?

For example:

['apple', 'banana', 'apple', 'orange'] # apple for example would be 0.5

Possible duplicate of http://stackoverflow.com/questions/2600191/how-can-i-count-the-occurrences-of-a-list-item-in-python — Daniel, Mar 21 '15 at 04:00
@Alpine, this really sounds like you are asking for us to do your homework. This program is not too difficult. You will want to check the length of the list and you will want to use dictionaries. — skyler, Mar 21 '15 at 04:01

score 9 · Accepted Answer · answered Mar 21 '15 at 04:09

You can use NLTK for this:

import nltk
text = ['apple', 'banana', 'apple', 'orange']
fd = nltk.FreqDist(text)

Check out the tutorial in the book, the how-to, and the source code.

Alternatively, you could use a Counter:

from collections import Counter
text = ['apple', 'banana', 'apple', 'orange']
c = Counter(text)

craighagerman

Isn't NLTK overkill for this? – matsjoyce Mar 24 '15 at 18:12
Is NLTK overkill? Depends. If you have NLTK installed already it has the 'batteries included' to calculate Frequency distributions and print out stats (most_common etc) which I find very useful. I do a lot of NLP work and find NLTK very useful. It is hardly overkill for me - just a useful tool for a particular job. But if you aren't doing any NLP work and are just doing a one-off frequency distribution, then it is overkill. That is why I gave two options. – craighagerman Mar 25 '15 at 18:46
That's not the relative frequency. It's just the counts. The relative frequency would have been {apple : 0.5, banana : 0.25, orange : 0.25} – Isbister Mar 06 '17 at 13:41
See below for an answer without third-party requirements: https://stackoverflow.com/a/58412985. This question is not (at all) specific to NLP, so 1) most people with a similar issue won't encounter it in the context of NLP, and 2) even in that case it should not be assumed that people have nltk installed, given the variety of NLP frameworks out there. That's regarding the first answer; the second part does not solve the question asked, since it returns absolute frequencies, whereas the question asked for relative frequencies. – pedjjj Apr 19 '20 at 11:31

score 4 · Answer 2 · answered Oct 16 '19 at 12:09

The following snippet does exactly what the question asks for: given a Counter() object, return a dict that contains the same keys but with relative frequencies as values. No third party library required.

def counter_to_relative(counter):
    total_count = sum(counter.values())
    relative = {}
    for key in counter:
        relative[key] = counter[key] / total_count
    return relative
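
As a usage sketch (assuming Python 3, where `/` is true division), the function can be fed a Counter built from the question's list. The function is repeated here in a compact form only so the snippet runs on its own; the variable names `words` and `relative` are illustrative.

```python
from collections import Counter


def counter_to_relative(counter):
    # Same logic as the function above: divide each count by the total.
    total_count = sum(counter.values())
    return {key: count / total_count for key, count in counter.items()}


words = ['apple', 'banana', 'apple', 'orange']
relative = counter_to_relative(Counter(words))
print(relative)  # {'apple': 0.5, 'banana': 0.25, 'orange': 0.25}
```

The values always sum to 1 (up to floating-point rounding), which is a quick sanity check for relative frequencies.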

score 3 · Answer 3 · answered May 31 '20 at 14:34

This simple code does the job. It returns a list of tuples, but you can adapt it easily.

lst = ['apple', 'banana', 'apple', 'orange']
counts = [(word, lst.count(word) / len(lst)) for word in set(lst)]

It returns the relative frequency of each word, as shown below (the order may vary, since sets are unordered):

[('orange', 0.25), ('banana', 0.25), ('apple', 0.5)]

Note that:

Iterating over set(lst) avoids duplicates.
Dividing lst.count(word) by len(lst) gives the relative frequency.

edited Jun 03 '20 at 14:52

answered May 31 '20 at 14:34

Ram


Welcome to SO! Could you add sample output to make it clearer what the code does? – xjcl May 31 '20 at 15:18

score 2 · Answer 4 · answered Mar 21 '15 at 04:19

You can do this pretty easily by just counting the number of times the element occurs in the list.

def relative_frequency(lst, element):
    return lst.count(element) / float(len(lst))

words = ['apple', 'banana', 'apple', 'orange']
print(relative_frequency(words, 'apple'))

score 0 · Answer 5 · answered Mar 21 '15 at 04:00

Make a dictionary with the words as keys and their number of occurrences as values. Once you have this dictionary, divide each value by the length of the word list.
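
A minimal sketch of that approach, assuming Python 3 division; the function name `relative_frequencies` is hypothetical and not part of the original answer:

```python
def relative_frequencies(words):
    # Hypothetical helper: count occurrences in a plain dict,
    # then divide each count by the length of the list.
    counts = {}
    for word in words:
        counts[word] = counts.get(word, 0) + 1
    total = len(words)
    return {word: count / total for word, count in counts.items()}


words = ['apple', 'banana', 'apple', 'orange']
print(relative_frequencies(words))  # {'apple': 0.5, 'banana': 0.25, 'orange': 0.25}
```

An empty list simply yields an empty dict, since the division only happens for words that were actually counted.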

justanothercoder
