I have a data frame df with only one variable, var, which holds some related values.
library(dplyr)  # provides %>%, count(), mutate() and recode()

df <- data.frame(var = c(rep('AUS',12), rep('NZ',12), rep('ENG',7), rep('SOC',12),
                         rep('PAK',11), rep('SRI',17), rep('IND',15)))
df %>% count(var)
# # A tibble: 7 x 2
#   var        n
#   <fctr> <int>
# 1 AUS       12
# 2 ENG        7
# 3 IND       15
# 4 NZ        12
# 5 PAK       11
# 6 SOC       12
# 7 SRI       17
Based on how the values are related, several of them should be recoded to a shared new value.
df %>% mutate(var = recode(var, 'AUS' = 'A', 'NZ' = 'A', 'ENG' = 'A',
'SOC' = 'A', 'PAK' = 'B', 'SRI' = 'B')) %>% count(var)
# # A tibble: 3 x 2
#   var        n
#   <fctr> <int>
# 1 A         43
# 2 IND       15
# 3 B         28
It can be seen that A and B replace 4 and 2 values respectively. The code above already produces the expected result. However, is there a more efficient way to do this, without repeating the new value once for every old value (4 times for A, 2 times for B)?
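For illustration, one possible way to state each new value only once is to keep the groups in a named list, expand it into a lookup vector, and splice that into recode() with !!!. This is only a sketch: the names groups, lookup and result are made up for this example, and stringsAsFactors = FALSE is an assumption added here so that var is a plain character column regardless of R version.

```r
library(dplyr)

# Same data as in the question; stringsAsFactors = FALSE is an assumption
# so that var is a character column on any R version.
df <- data.frame(var = c(rep('AUS',12), rep('NZ',12), rep('ENG',7), rep('SOC',12),
                         rep('PAK',11), rep('SRI',17), rep('IND',15)),
                 stringsAsFactors = FALSE)

# Hypothetical helper list: each new value appears once, with its old values.
groups <- list(A = c('AUS', 'NZ', 'ENG', 'SOC'),
               B = c('PAK', 'SRI'))

# Expand to a named vector: names are the old values, elements the new ones.
lookup <- setNames(rep(names(groups), lengths(groups)),
                   unlist(groups, use.names = FALSE))

# Splice the lookup into recode(); values not in the lookup (IND) are kept.
result <- df %>%
  mutate(var = recode(var, !!!lookup)) %>%
  count(var)

print(result)
```

If var is a factor, forcats::fct_collapse(var, A = c('AUS', 'NZ', 'ENG', 'SOC'), B = c('PAK', 'SRI')) is another option that names each new level once.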