Here is my scenario: I have a dataset with a column made up of lists of words (keyword tags for YouTube videos, where each row holds the data for one video).
What I want is a complete count of every unique item within these lists, across the entire column. In the end I'd like a table with two fields: keyword and count.
If I run a simple dplyr query, it treats each whole list as a unique object and counts that. While that is also interesting, it is not what I want.
This is the dplyr query I have so far. I'd like to build on it, but I'm not sure how to count the individual items inside the lists rather than the lists themselves:
library(dplyr)

vid_tag_freq <- df %>%
  count(tags)
To further clarify:
With a dataset like:
Tags
1 ['Dog', 'Cat', 'Mouse', 'Fish']
2 ['Cat', 'Fish']
3 ['Cat', 'Fish']
I am now getting:
  Tags                              Count
1 ['Dog', 'Cat', 'Mouse', 'Fish']   1
2 ['Cat', 'Fish']                   2
What I actually want:
  Tags     Count
1 'Cat'    3
2 'Fish'   3
3 'Dog'    1
4 'Mouse'  1
I hope that explains it lol
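For the toy example, one possible approach is to unnest the lists so that each tag gets its own row, and then count. This is only a sketch: it assumes `tags` is a true list-column of character vectors (which may not match the real data), and the `df` built here is a hypothetical stand-in for the actual dataset.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data mirroring the example above;
# assumes `tags` is a list-column of character vectors
df <- tibble(
  tags = list(
    c("Dog", "Cat", "Mouse", "Fish"),
    c("Cat", "Fish"),
    c("Cat", "Fish")
  )
)

tag_freq <- df %>%
  unnest(tags) %>%            # one row per individual tag
  count(tags, sort = TRUE)    # tally each tag, most frequent first

print(tag_freq)
```

`count()` names its result column `n` by default, so the output has the two fields `tags` and `n`.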
EDIT: This is what my data actually looks like. I guess most entries are lists of lists? Maybe I should clean up the [0] entries and treat them as null?
[1] "[['Flood (Disaster Type)', 'Burlington (City/Town/Village)', 'Ontario (City/Town/Village)']]"
[2] "[0]"
[3] "[0]"
[4] "[['Rocket (Product Category)', 'Interview (TV Genre)', 'Canadian Broadcasting Corporation (TV Network)', 'Israel (Country)', 'Gaza War (Military Conflict)']]"
[5] "[0]"
[6] "[['Iraq (Country)', 'Military (Film Genre)', 'United States Of America (Country)']]"
[7] "[['Ebola (Disease Or Medical Condition)', 'Chair', 'Margaret Chan (Physician)', 'WHO']]"
[8] "[['CBC Television (TV Network)', 'CBC News (Website Owner)', 'Canadian Broadcasting Corporation (TV Network)']]"
[9] "[['Rob Ford (Politician)', 'the fifth estate', 'CBC Television (TV Network)', 'Bill Blair', 'Gillian Findlay', 'Documentary (TV Genre)']]"
[10] "[['B.C.', 'Dog Walking (Profession)', 'dogs', 'dog walker', 'death', 'dead']]"
[11] "[['Suicide Of Amanda Todd (Event)', 'Amanda Todd', 'cyberbullying', 'CBC Television (TV Network)', 'the fifth estate', 'Mark Kelley', 'cappers', 'Documentary (TV Genre)']]"
[12] "[['National Hockey League (Sports Association)', 'Climate Change (Website Category)', 'Hockey (Sport)', 'greenhouse gas', 'emissions']]"
[13] "[['Rob Ford (Politician)', 'bomb threat', 'Toronto (City/Town/Village)', 'City Hall (Building)']]"
[14] "[['Blue Jays', 'Ashes', 'friends']]"
[15] "[['Robin Williams (Celebrity)', 'Peter Gzowski']]"
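Judging by the quotes in the printout, each entry appears to be a single character string rather than an R list, so the tags would need to be pulled out of the string before counting. The sketch below assumes every tag is wrapped in single quotes and contains no apostrophe of its own; `extract_tags` is a hypothetical helper, and the sample `df` uses shortened strings in the same format as the data above. Entries like "[0]" contain no quoted items, so they drop out without any special null handling.

```r
library(dplyr)
library(tidyr)

# Hypothetical sample: shortened strings in the same format as the real column
df <- tibble(
  tags = c(
    "[['Rob Ford (Politician)', 'the fifth estate', 'CBC Television (TV Network)']]",
    "[0]",
    "[['CBC Television (TV Network)', 'the fifth estate']]"
  )
)

# Hypothetical helper: return a list with one character vector of tags per string.
# Assumes tags are single-quoted and contain no apostrophes.
extract_tags <- function(x) {
  quoted <- regmatches(x, gregexpr("'[^']*'", x))    # every '...' chunk
  lapply(quoted, function(m) gsub("^'|'$", "", m))   # strip the surrounding quotes
}

tag_freq <- df %>%
  mutate(tag = extract_tags(tags)) %>%   # list-column of tags per video
  select(tag) %>%
  unnest(tag) %>%                        # rows with no tags ("[0]") are dropped here
  count(tag, sort = TRUE)

print(tag_freq)
```

If some tags do contain apostrophes, the regular expression would miss or split them, and a proper parser for the list syntax would be a safer choice.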