
In R (using RStudio), I have a column with 120 categories, but I want to reduce them to 9 plus "Other", where "Other" is an aggregation of all the rest.

Here is my code:

for (i in 1:nrow(kickstarter.df)) {
  if (kickstarter.df[i, "main_category"] == "Film & Video" ||
      kickstarter.df[i, "main_category"] == "Music" ||
      kickstarter.df[i, "main_category"] == "Publishing" ||
      kickstarter.df[i, "main_category"] == "Games" ||
      kickstarter.df[i, "main_category"] == "Technology" ||
      kickstarter.df[i, "main_category"] == "Art" ||
      kickstarter.df[i, "main_category"] == "Design" ||
      kickstarter.df[i, "main_category"] == "Food" ||
      kickstarter.df[i, "main_category"] == "Fashion") {
    kickstarter.df[i, "main_category"] <- kickstarter.df[i, "main_category"]
  }
  if (kickstarter.df[i, "main_category"] != "Film & Video" ||
      kickstarter.df[i, "main_category"] != "Music" ||
      kickstarter.df[i, "main_category"] != "Publishing" ||
      kickstarter.df[i, "main_category"] != "Games" ||
      kickstarter.df[i, "main_category"] != "Technology" ||
      kickstarter.df[i, "main_category"] != "Art" ||
      kickstarter.df[i, "main_category"] != "Design" ||
      kickstarter.df[i, "main_category"] != "Food" ||
      kickstarter.df[i, "main_category"] != "Fashion") {
    kickstarter.df[i, "main_category"] <- "Other"
  }
}

The result is that every value in the column is overwritten and becomes NA.

pogibas
MtlDataBoy
  • When asking for help, you should include a simple [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) with sample input and desired output that can be used to test and verify possible solutions. Note that RStudio is just a fancy editor that seems unrelated to your question; you are actually just writing R code. – MrFlick Apr 05 '18 at 15:57
  • Likely helpful: http://forcats.tidyverse.org/reference/fct_collapse.html – MrFlick Apr 05 '18 at 15:58
  • Just try `kickstarter.df$main_category[!kickstarter.df$main_category %in% tokeep]<-"Other"`, where `tokeep` is a vector with the categories you want to keep: `c("Film & Video","Music",...)`. – nicola Apr 05 '18 at 16:19

1 Answer


Let A be the vector of categories you want to keep; then you can do:

A <- c("Film & Video", "Music", "Publishing", "Games", "Technology",
       "Art", "Design", "Food", "Fashion")

kickstarter.df[!kickstarter.df[, "main_category"] %in% A, "main_category"] <- "Other"

or

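# replace() returns a modified copy, so assign the result back to the column
kickstarter.df[, "main_category"] <- replace(kickstarter.df[, "main_category"],
                                             !kickstarter.df[, "main_category"] %in% A,
                                             "Other")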
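
As a quick check, here is a minimal sketch of the `%in%` approach on a small made-up data frame. The name `toy.df` and its values are assumptions for illustration only, since the real `kickstarter.df` is not shown in the question. The sketch also assumes `main_category` is stored as character rather than as a factor.

```r
# Hypothetical toy data standing in for kickstarter.df (not the real data).
toy.df <- data.frame(
  main_category = c("Music", "Dance", "Games", "Comics", "Art"),
  stringsAsFactors = FALSE
)

# Categories to keep, same values as the vector A in the answer.
A <- c("Film & Video", "Music", "Publishing", "Games", "Technology",
       "Art", "Design", "Food", "Fashion")

# Overwrite every value that is not one of the kept categories.
toy.df[!toy.df[, "main_category"] %in% A, "main_category"] <- "Other"

# Count the rows per category after collapsing.
print(table(toy.df$main_category))
```

Because the whole column is handled in one vectorised step, no `for` loop over the rows is needed.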
Onyambu
  • After trying both of your snippets, a warning appears: Warning message: In `[<-.factor`(`*tmp*`, list, value = "other") : invalid factor level, NA generated – MtlDataBoy Apr 07 '18 at 18:11
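
The "invalid factor level" warning suggests that `main_category` is stored as a factor and that "Other" is not one of its existing levels, so R writes NA instead. One possible workaround, sketched below under that assumption, is to do the replacement on a character copy of the column and convert back to a factor afterwards. The `toy.df` data and the `cat_chr` helper variable are made up for illustration.

```r
# Hypothetical toy data with a factor column (not the real kickstarter.df).
toy.df <- data.frame(
  main_category = factor(c("Music", "Dance", "Games", "Comics", "Art"))
)

# Categories to keep, same values as the vector A in the answer.
A <- c("Film & Video", "Music", "Publishing", "Games", "Technology",
       "Art", "Design", "Food", "Fashion")

# Work on a character copy so "Other" does not have to be an existing level.
cat_chr <- as.character(toy.df$main_category)
cat_chr[!cat_chr %in% A] <- "Other"

# Convert back to a factor; levels are rebuilt from the values now present.
toy.df$main_category <- factor(cat_chr)

print(levels(toy.df$main_category))
```

Rebuilding the factor also drops the levels that were collapsed, so only the kept categories that actually occur, plus "Other", remain as levels.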