0

Let's say I have a list of words and a string. I want a new list that holds the frequency in the string of each word from the list. Each count should sit at the same index as its word in listWords, so the result has the same length as listWords.

listWords = ['Noodles', 'Instant', 'Flavour', 'Ramen', 'Chicken', 'Flavor', 'Spicy', 'Beef'] 

string = "Cup Noodles Chicken Vegetable Noodles" 

The resulting array should look like this, where each index holds the frequency of the word at the same position in listWords, or 0 if the word does not appear in the string:

result = [2, 0, 0, 0, 1, 0, 0, 0] 
Gino Mempin
John Smith
  • Does this answer your question? [Count frequency of words in a list and sort by frequency](https://stackoverflow.com/questions/20510768/count-frequency-of-words-in-a-list-and-sort-by-frequency) – Stuart May 05 '20 at 21:39
  • Or [this one](https://stackoverflow.com/questions/9919604/efficiently-calculate-word-frequency-in-a-string)? – Stuart May 05 '20 at 21:54

2 Answers

5

You can split the sentence and pass the words to collections.Counter(). You can then look up the count of each word in your word list; a Counter returns 0 for any word it has not seen. For example:

from collections import Counter

string = "Cup Noodles Chicken Vegetable Noodles"
listWords = ['Noodles', 'Instant', 'Flavour', 'Ramen', 'Chicken', 'Flavor', 'Spicy', 'Beef']

counts = Counter(string.split())
[counts[word] for word in listWords]
# [2, 0, 0, 0, 1, 0, 0, 0]
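
If the same lookup is needed for many strings (for example, one per CSV row), the snippet above could be wrapped in a small helper. This is only a sketch: the name `word_frequencies` is hypothetical, and it assumes words are separated by whitespace and matched case-sensitively, as in the example.

```python
from collections import Counter


def word_frequencies(text, words):
    """Return the count of each entry of `words` in `text`, in the same order."""
    # Counter returns 0 for keys it has never seen, so no special-casing is needed.
    counts = Counter(text.split())
    return [counts[word] for word in words]


listWords = ['Noodles', 'Instant', 'Flavour', 'Ramen', 'Chicken', 'Flavor', 'Spicy', 'Beef']

print(word_frequencies("Cup Noodles Chicken Vegetable Noodles", listWords))
# [2, 0, 0, 0, 1, 0, 0, 0]
```

The result always has the same length and order as the word list, even for an empty string.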

Without Counter()

You can, of course, do this without Counter(). When building the counts, you need to handle the KeyError raised the first time a word is seen, because its key does not exist in the dict yet. When looking words up afterwards, use get(word, 0) to return a default of 0 for words that never appeared. Something like:

string = "Cup Noodles Chicken Vegetable Noodles"
listWords = ['Noodles', 'Instant', 'Flavour', 'Ramen', 'Chicken', 'Flavor', 'Spicy', 'Beef']

counts = {}

for word in string.split():
    try:
        counts[word] += 1
    except KeyError:
        counts[word] = 1


[counts.get(word, 0) for word in listWords]
# still [2, 0, 0, 0, 1, 0, 0, 0]
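
As a variation on the loop above, the try/except can be avoided by using `get()` while counting as well as while looking up. This is a sketch of the same idea with a plain dict, not a different algorithm; the variable name `result` is introduced here for illustration.

```python
string = "Cup Noodles Chicken Vegetable Noodles"
listWords = ['Noodles', 'Instant', 'Flavour', 'Ramen', 'Chicken', 'Flavor', 'Spicy', 'Beef']

counts = {}
for word in string.split():
    # get() supplies 0 the first time a word is seen, so no KeyError is raised
    counts[word] = counts.get(word, 0) + 1

# Look up each word in list order, defaulting to 0 for words not in the string
result = [counts.get(word, 0) for word in listWords]
print(result)
# [2, 0, 0, 0, 1, 0, 0, 0]
```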
Mark
  • 90,562
  • 7
  • 108
  • 148
  • Is there a way to do it without using Counter? I am reading data from a CSV file and it says Counter() is not callable – John Smith May 05 '20 at 22:02
  • `collections.Counter()` is very efficient and part of the Python standard library, so it should be available. But I've added an edit with an alternative if you want to do it the 'hard way'. – Mark May 05 '20 at 23:22
0

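Since you asked for a way without using Counter, here is a piece of code that would work. It makes one pass over listWords to build an index and one pass over the words in the string, so it should run in roughly linear time.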

listWords = ['Noodles', 'Instant', 'Flavour', 'Ramen', 'Chicken', 'Flavor', 'Spicy', 'Beef']
string = "Cup Noodles Chicken Vegetable Noodles"

# Map each word to its position in listWords
indices = {word: i for i, word in enumerate(listWords)}
freq = [0] * len(listWords)

# Increment the slot of every word in the string that is also in listWords
for word in string.split():
    if word in indices:
        freq[indices[word]] += 1

print(freq)
# [2, 0, 0, 0, 1, 0, 0, 0]
AdishRao
  • 163
  • 1
  • 1
  • 12