I am trying to set up a data set that records how often several different names are mentioned in a list of articles. For each article, I want to know how often nameA, nameB and so forth are mentioned. However, I am having trouble iterating over the list.

My code is the following:

for element in list_of_names:
    for i in list_of_articles:
        list_of_namecounts = len(re.findall(element, i))

  1. list_of_names = a string with several names [nameA nameB nameC]
  2. list_of_articles = a list with 40,000 strings that are articles

Example of article in list_of_articles:

  1. Index: 1
  2. Type: str
  3. Size: Amsterdam - de financiële ...

The error I get is: expected string or buffer

I thought that, when iterating over the list of strings, re.findall would work on each string like this, but I am also fairly new to Python. Any idea how to solve my issue here?

Thank you!

M.Metz

1 Answer

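If your list is ['apple', 'apple', 'banana'] and you want the result "number of apple = 2", you can use a Counter. Note that this counts list elements that are exactly equal to a name, not occurrences of the name inside a longer string: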

from collections import Counter

list_count = Counter(list_of_articles)

for element in list_of_names:
    list_of_namecounts = list_count[element]

And, assuming list_of_namecounts is meant to be a list:

list_of_namecounts = []
for element in list_of_names:
    list_of_namecounts.append(list_count[element])

See this for more understanding
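
If each element of list_of_articles is a full article rather than a single word, the names have to be searched for inside each string. Below is a minimal sketch of that, under some assumptions: the names string is split on whitespace first (iterating over a string directly yields single characters), and any entry that is not a str is treated as empty text. A non-string entry such as None is a likely cause of "expected string or buffer", since re.findall raises a TypeError when its second argument is not a string. The function name count_mentions and the sample data are hypothetical, not from the question.

```python
import re


def count_mentions(names, articles):
    """Return one dict per article mapping each name to its number of mentions."""
    # Compile each name once; re.escape makes the name match as literal text.
    patterns = {name: re.compile(re.escape(name)) for name in names}
    rows = []
    for article in articles:
        # Assumption: entries that are not strings (e.g. None) count as empty text,
        # because re would raise a TypeError on them.
        text = article if isinstance(article, str) else ""
        rows.append({name: len(p.findall(text)) for name, p in patterns.items()})
    return rows


# Hypothetical sample data for illustration only.
list_of_names = "nameA nameB nameC".split()  # turn the string into a list of names
list_of_articles = ["nameA met nameB. nameA left.", None, "nameC"]
namecounts = count_mentions(list_of_names, list_of_articles)
```

Here namecounts[i] holds the counts for the i-th article, so the result lines up with list_of_articles. Plain substring matching also counts a name that appears inside a longer word; whether that is acceptable depends on the names being searched for.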

EmilioK