I am new to coding and could use help. Here is my task: I have a CSV of online marketing image titles. It is a single column, and each cell holds the title text for one ad as a plain string of words. For instance, cell A1 reads "16 Maddening Tire Fails", and so on. To load the CSV I do:
import csv

with open('usethis.csv', 'rb') as f:
    mycsv = csv.reader(f)
    mycsv = list(mycsv)
I initialize a list:
mylist = []
My goal is to take the text in each cell and extract the bigrams. I do that as follows:
import nltk
from nltk.tokenize import word_tokenize

for i, c in enumerate(mycsv):
    mylist.append(list(nltk.bigrams(word_tokenize(' '.join(c)))))
mylist then looks like this, but with more data:
[[('16', 'Maddening'), ('Maddening', 'Tire'), ('Tire', 'Fails')], [('16', 'Maddening'), ('Maddening', 'Tire'), ('Tire', 'Fails'), ('Fails', 'That'), ('That', 'Show'), ('Show', 'What'), ('What', 'True'), ('True', 'Negligence'), ('Negligence', 'Looks'), ('Looks', 'Like')]]
mylist holds individual lists which are the bigrams created from each cell in my csv.
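In case it helps anyone reproduce this without my CSV, here is a small self-contained sketch that builds the same kind of structure. The two rows are just the example titles from above, and `sample_rows`, `simple_bigrams`, and `sample_list` are names I made up for illustration. It uses plain `str.split` and `zip` instead of nltk, which I am assuming gives the same bigrams for simple titles like these:

```python
# Hypothetical stand-in for the rows read from usethis.csv (one title per row)
sample_rows = [
    ["16 Maddening Tire Fails"],
    ["16 Maddening Tire Fails That Show What True Negligence Looks Like"],
]

def simple_bigrams(text):
    # Split on whitespace and pair each word with the word that follows it
    words = text.split()
    return list(zip(words, words[1:]))

# One list of bigrams per row, mirroring the shape of mylist
sample_list = [simple_bigrams(' '.join(row)) for row in sample_rows]
print(sample_list[0])
```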
Now I want to loop through every bigram in all of the lists and, next to each bigram, print the number of times it appears in the other lists (cells). This would basically be the same as a COUNTIFS in Excel. For instance, if the bigram ('16', 'Maddening') from the first list (cell A1) appears 3 other times in mylist, then print the number 3 next to it, and so on for each bigram. If it is easier to return this information in a new list, that's fine. I just need it printed out somewhere that makes sense.
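To make the expected output concrete, here is a rough sketch of the kind of counting I am picturing, on a tiny hand-made mylist. The sample data and the names `cells_containing` and `counts` are made up for illustration, and I am assuming "appears in another list" means "how many other cells contain this bigram". I am not sure `collections.Counter` is the right tool, or how best to apply this to my real data:

```python
from collections import Counter

# Hypothetical sample data standing in for the real mylist
mylist = [
    [('16', 'Maddening'), ('Maddening', 'Tire'), ('Tire', 'Fails')],
    [('16', 'Maddening'), ('Maddening', 'Tire'), ('Tire', 'Fails'), ('Fails', 'That')],
    [('Tire', 'Fails'), ('Fails', 'Again')],
]

# For each bigram, count how many cells contain it at least once
# (set(cell) makes sure a bigram repeated within one cell is counted once)
cells_containing = Counter(bg for cell in mylist for bg in set(cell))

# For every bigram in every cell, the number of OTHER cells containing it
counts = [[(bg, cells_containing[bg] - 1) for bg in cell] for cell in mylist]

for cell in counts:
    for bg, n in cell:
        print(bg, n)
```

Here ('Tire', 'Fails') in the first cell would get a 2 next to it, because it also shows up in the two other cells.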
I have done a lot of reading online. For instance, this question was roughly along the same lines: How to check if all elements of a list matches a condition?
This question about dictionaries was also similar, in that it returns a number next to each value, much as I want to return a count next to each bigram: What are Python dictionary view objects?
But I really am at a loss as to how to do this. Thank you so much in advance for your help! Let me know if I need to explain something better.