I have a variable (`distributor`, stored as a factor) in a data.frame of movies. I want to replace the name of every distributor that appears fewer than 10 times with 'Small Companies'. I can produce a list of distributors with their counts using

aggregate(data.frame(count = distributor), list(value = distributor), length)

but I am unable to replace the values within my data.frame.

Jaap
Zafer
  • It's easier to help you if you include a simple [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) with sample input and desired output that can be used to test and verify possible solutions. – MrFlick Oct 23 '18 at 18:22
  • For simplicity, assume we want to rename distributors that appear fewer than 4 times. The distributor column looks like this: `Movies$distributor = c("A","A","A","A","B","B","B","B","C","D")`. We want it to look like this: `Movies$distributor = c("A","A","A","A","B","B","B","B","Small Company","Small Company")`. Essentially, we want to replace C and D with "Small Company". – Zafer Oct 23 '18 at 18:26
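
The simplified example in the comment above can be sketched in base R. This is only an illustration under stated assumptions: the data frame name `Movies`, the threshold variable `min_count`, and the helper vector `small` are chosen for this sketch, and the column is assumed to be a factor as described in the question.

```r
# Sample data from the comment; the column is a factor, as in the question.
Movies <- data.frame(
  distributor = factor(c("A", "A", "A", "A", "B", "B", "B", "B", "C", "D"))
)

min_count <- 4  # threshold used in the simplified example (assumed name)

# Count how often each distributor occurs.
counts <- table(Movies$distributor)

# Names of the distributors that fall below the threshold.
small <- names(counts)[counts < min_count]

# Work on a character copy: assigning a new label directly into a factor
# would produce NA because "Small Company" is not an existing level.
distributor <- as.character(Movies$distributor)
distributor[distributor %in% small] <- "Small Company"

# Convert back to a factor so the column keeps its original type.
Movies$distributor <- factor(distributor)

Movies$distributor
```

Going through a character copy avoids the "invalid factor level" warning that a direct assignment into the factor would raise.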

1 Answer

Here is a solution using dplyr.

library(dplyr)

## make some dummy data
df <- tribble(
     ~distributor, ~something,
     "dist1", 89,
     "dist2", 92,
     "dist3", 29,
     "dist1", 89
)


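df %>% 
     group_by(distributor) %>% 
     ## this counts the number of occurrences of each distributor
     mutate(occurrences = n()) %>% 
     ## drop the grouping so that distributor can be modified
     ungroup() %>% 
     ## change the name of the distributor if the occurrences are less than 2;
     ## as.character() keeps the labels (not the integer codes) if the column is a factor
     mutate(distributor = ifelse(occurrences < 2, "small company", as.character(distributor)))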
TBT8
  • Hi, thank you for your answer. When we try this code we keep getting the following error: Error in mutate_impl(.data, dots) : Column `distributor` can't be modified because it's a grouping variable – Zafer Oct 23 '18 at 18:49
  • Did you include the `ungroup()` line? That should clear the grouping variables. – TBT8 Oct 23 '18 at 20:21
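
The grouping error mentioned in the comments comes up when `mutate()` is asked to change a column that the data is still grouped by. One way to sidestep it, sketched here as an alternative rather than as the answer's own method, is dplyr's `add_count()`, which appends the per-value count without leaving the data grouped. The sketch assumes a dplyr version in which `add_count()` accepts the `name` argument; the data frame `Movies`, the threshold of 4, and the `result` variable are taken from the simplified example in the comments or invented for illustration.

```r
library(dplyr)

# Sample data from the comments, with the column stored as a factor.
Movies <- data.frame(
  distributor = factor(c("A", "A", "A", "A", "B", "B", "B", "B", "C", "D"))
)

result <- Movies %>%
  # Append the number of rows per distributor; the data stays ungrouped.
  add_count(distributor, name = "occurrences") %>%
  # Convert to character so ifelse() returns labels, not factor codes.
  mutate(
    distributor = ifelse(occurrences < 4,
                         "Small Company",
                         as.character(distributor))
  ) %>%
  # The helper column is no longer needed.
  select(-occurrences)

result
```

Because no `group_by()` is involved, there is no grouping variable to protect, so `distributor` can be overwritten directly.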