I have a dataset as follows:
EstablishmentName Freq
bahria university 20
bahria university islamabad 12
arid agriculture 3
arid agriculture university 15
arid rawalpindi 9
college of e&me, nust 20
college of e & me (nust) 15
college of eme 30
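For reproducibility, the table above can be rebuilt as a data frame roughly like this. This is only a sketch: the name `df` matches the object used in my attempt further down, and I am assuming plain character and numeric columns.

```r
# Sample data copied from the table above.
# Assumption: EstablishmentName is character, Freq is numeric.
df <- data.frame(
  EstablishmentName = c(
    "bahria university",
    "bahria university islamabad",
    "arid agriculture",
    "arid agriculture university",
    "arid rawalpindi",
    "college of e&me, nust",
    "college of e & me (nust)",
    "college of eme"
  ),
  Freq = c(20, 12, 3, 15, 9, 20, 15, 30),
  stringsAsFactors = FALSE
)

str(df)
```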
As you can see above, Bahria University and Bahria University Islamabad are almost the same, and the same goes for the other strings. I want to unify each group of near-duplicates under one name and combine their frequencies, such that:
Expected output:
EstablishmentName Freq
Bahria University 32
Arid Agriculture 27
College of EME 65
I have tried the following solution, but it doesn't seem to work:
library(SnowballC)
library(dplyr)
mutate(df, word = wordStem(EstablishmentName)) %>%
  group_by(EstablishmentName) %>%
  summarise(total = sum(Freq))