I have a list with ~1000 entries with the following structure (small example):
example <- list(
"1" = c("car","house"),
"2" = c("family","work","car"),
"3" = c("house","Work","car"),
"4" = "school",
"5" = c("Car","school"))
Most entries in the list contain only one string, but some contain 2, 3, 4, 5 or even more. I don't know the maximum number of strings per entry, because I don't know how to find it without scrolling through all ~1000 entries.
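One possible way to get the number of strings per entry without scrolling is base R's `lengths()`. This is only a sketch on the small example; the variable name `n_per_entry` is made up for illustration.

```r
# Small example list from the question
example <- list(
  "1" = c("car", "house"),
  "2" = c("family", "work", "car"),
  "3" = c("house", "Work", "car"),
  "4" = "school",
  "5" = c("Car", "school")
)

# lengths() returns the number of elements in each list entry
n_per_entry <- lengths(example)

max(n_per_entry)     # largest number of strings in any single entry
table(n_per_entry)   # how many entries hold 1, 2, 3, ... strings
```

On the real list, `which(n_per_entry == max(n_per_entry))` should point to the entries that hold the most strings.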
I want to get a summary of the strings in my list. I want to know:
- How many different strings there are (e.g. 5 in the small example)
- How often the different strings occur (e.g. family:1, work:2, .... in the small example). I would like to visualize this in a plot later.
- I want the analysis to be case-insensitive (e.g. family and Family should be treated the same)
- I want to exclude duplicates within an entry (e.g. if one entry contains c("family","car","family"), family should be counted only once)
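To make the four requirements above concrete, here is a rough base R sketch of what I imagine the result could look like. It assumes that lower-casing with `tolower()` is an acceptable way to ignore case; the names `cleaned` and `counts` are made up for illustration.

```r
# Small example list from the question
example <- list(
  "1" = c("car", "house"),
  "2" = c("family", "work", "car"),
  "3" = c("house", "Work", "car"),
  "4" = "school",
  "5" = c("Car", "school")
)

# Lower-case every entry, then drop duplicates within that entry
cleaned <- lapply(example, function(x) unique(tolower(x)))

# Count in how many entries each distinct string occurs
counts <- sort(table(unlist(cleaned)), decreasing = TRUE)

length(counts)   # number of distinct strings
counts           # frequency of each string

# Simple bar chart of the frequencies
barplot(counts, las = 2, ylab = "Number of entries")
```

Because duplicates are removed per entry before counting, each count is the number of entries that contain the string, not the total number of times it appears.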