28

I have a data frame with a continuous numeric variable, age in months (age_mnth). I want to create a new discrete variable with age categories based on age intervals.

# Some example data
rota2 <- data.frame(age_mnth = 1:170)

I've written an ifelse-based procedure (below), but I believe a more elegant solution is possible.

rota2$age_gr<-ifelse(rota2$age_mnth < 6, rr2 <- "0-5 mnths",
   ifelse(rota2$age_mnth > 5 & rota2$age_mnth < 12, rr2 <- "6-11 mnths",
          ifelse(rota2$age_mnth > 11 & rota2$age_mnth < 24, rr2 <- "12-23 mnths",
                 ifelse(rota2$age_mnth > 23 & rota2$age_mnth < 60, rr2 <- "24-59 mnths",
                        ifelse(rota2$age_mnth > 59 & rota2$age_mnth < 167, rr2 <- "5-14 yrs",
                              rr2 <- "adult")))))

I know there is a cut function, but I couldn't work out how to use it to discretize / categorize this variable.

Henrik
Aybek Khodiev
  • 1
    A basic error here is the use of the assignment operator in the values for the "yes" and "no" parameters – IRTFM Apr 12 '17 at 16:48
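To illustrate the point in the comment above: the "yes" and "no" arguments of ifelse only need to be values, so the `rr2 <-` assignments can simply be dropped. As far as I can tell, in the original they do nothing except leave a stray variable rr2 behind. Because the nested calls are checked in order, the lower-bound tests are also redundant. A minimal sketch using the question's example data:

```r
# Example data from the question
rota2 <- data.frame(age_mnth = 1:170)

# Nested ifelse() with plain character values and no assignment inside the calls.
# Each test is only reached when all earlier tests were FALSE,
# so only the upper bound of each interval needs to be checked.
rota2$age_gr <- ifelse(rota2$age_mnth < 6,   "0-5 mnths",
                ifelse(rota2$age_mnth < 12,  "6-11 mnths",
                ifelse(rota2$age_mnth < 24,  "12-23 mnths",
                ifelse(rota2$age_mnth < 60,  "24-59 mnths",
                ifelse(rota2$age_mnth < 167, "5-14 yrs", "adult")))))

# Number of rows per category
table(rota2$age_gr)
```

This still nests five calls, so it is a cleanup of the original rather than the more elegant solution being asked for.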

2 Answers

51

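I don't see a reason not to use cut here; it does exactly what you want. The key is right = FALSE, which makes every interval closed on the left and open on the right, so each break (6, 12, 24, ...) starts a new category: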

# Some example data
rota2 <- data.frame(age_mnth = 1:170)

# Your way of doing things, to compare against
# (without the rr2 <- assignments, which do not affect the result)
rota2$age_gr <- ifelse(rota2$age_mnth < 6, "0-5 mnths",
                ifelse(rota2$age_mnth > 5 & rota2$age_mnth < 12, "6-11 mnths",
                ifelse(rota2$age_mnth > 11 & rota2$age_mnth < 24, "12-23 mnths",
                ifelse(rota2$age_mnth > 23 & rota2$age_mnth < 60, "24-59 mnths",
                ifelse(rota2$age_mnth > 59 & rota2$age_mnth < 167, "5-14 yrs",
                       "adult")))))

# Using cut: the breaks are the lower bounds of each category
rota2$age_grcut <- cut(rota2$age_mnth,
                       breaks = c(-Inf, 6, 12, 24, 60, 167, Inf),
                       labels = c("0-5 mnths", "6-11 mnths", "12-23 mnths",
                                  "24-59 mnths", "5-14 yrs", "adult"),
                       right = FALSE)

# Both approaches agree on every row (cut returns a factor, hence as.character)
all(rota2$age_gr == as.character(rota2$age_grcut))
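To show why right = FALSE matters here, this is a small sketch comparing it with cut's default (right = TRUE) on the same breaks. The names labs, left_closed and right_closed are only for illustration:

```r
rota2 <- data.frame(age_mnth = 1:170)
breaks <- c(-Inf, 6, 12, 24, 60, 167, Inf)
labs <- c("0-5 mnths", "6-11 mnths", "12-23 mnths",
          "24-59 mnths", "5-14 yrs", "adult")

# Intervals [a, b): a break value belongs to the category it starts
left_closed <- cut(rota2$age_mnth, breaks = breaks, labels = labs, right = FALSE)

# Default intervals (a, b]: a break value belongs to the category it ends
right_closed <- cut(rota2$age_mnth, breaks = breaks, labels = labs)

# A 6-month-old lands in different categories
as.character(left_closed[6])   # "6-11 mnths"
as.character(right_closed[6])  # "0-5 mnths"

# Number of rows per category with the intended, left-closed intervals
table(left_closed)
```

With the default, every child whose age falls exactly on a break would be placed one category too low, which is easy to miss because the labels stay the same.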
Dason
21
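# findInterval returns, for each age, the index of the interval it falls into;
# that index is used to pick the matching label
rota2$age_gr <- c("0-5 mnths", "6-11 mnths", "12-23 mnths", "24-59 mnths", "5-14 yrs", "adult")[
  findInterval(rota2$age_mnth, c(-Inf, 5.5, 11.5, 23.5, 59.5, 166.5, Inf))]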
IRTFM
  • 2
    It is slightly different than `cut` in that the intervals are closed on the left and open on the right unless specified otherwise. – IRTFM Nov 26 '12 at 06:01
  • 1
    But you can always write a version of findInterval that is closed on the right and open on the left - http://stackoverflow.com/questions/13482872/findinterval-with-right-closed-intervals – Dason Nov 26 '12 at 06:19
  • Yes, you can, and you can call cut with other parameters to make it behave like findInterval ... and you can also use cut2 from Hmisc which has the defaults I prefer. – IRTFM May 05 '19 at 23:50
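The interval conventions discussed in these comments can be checked on a few boundary values. This is a rough sketch in base R only; the vectors age and breaks are made up for illustration, and it assumes a version of R in which findInterval has the left.open argument (I believe 3.3.0 or later):

```r
age <- c(5, 6, 11, 12)
breaks <- c(-Inf, 6, 12, Inf)

# findInterval default: intervals closed on the left, [6, 12)
findInterval(age, breaks)                    # 1 2 2 3

# left.open = TRUE: intervals closed on the right, (6, 12]
findInterval(age, breaks, left.open = TRUE)  # 1 1 2 2

# cut with right = FALSE matches the findInterval default
as.integer(cut(age, breaks, right = FALSE))  # 1 2 2 3

# cut's own default (right = TRUE) matches left.open = TRUE
as.integer(cut(age, breaks))                 # 1 1 2 2
```

So either function can produce either convention; the main practical difference is that cut returns a labelled factor while findInterval returns integer indices.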