0

I have a variable, glyhb, that is numeric and ranges from 2.85 to 16.11. How can I turn it into a categorical variable where everything under 5.7 is one category, everything from 5.7 to 6.4 is another, and everything 6.5 or higher is a third?

  • 2
    Best dupe I can find quickly: [R - cut by defined interval](http://stackoverflow.com/q/5746544/903061), maybe someone has a better duplicate? – Gregor Thomas Apr 29 '16 at 16:19
  • 2
    @Gregor this one is quite similar. [specify-factor-levels-for-intervals](http://stackoverflow.com/questions/21558129/specify-factor-levels-for-intervals) – lmo Apr 29 '16 at 16:57

2 Answers

4

The function cut() divides a numerical vector into segments according to the values defined in the parameter breaks. In this case we can include the option right=FALSE to specify that the value 5.7 should belong to category 2 and that the value 6.5 should be assigned to category 3. The default is to include the value at the right boundary in the corresponding segment.

cut(glyhb, breaks=c(0,5.7,6.5,Inf), right=FALSE, labels=paste0("cat", 1:3))

By default cut() returns a factor. We can name the levels of this factor with the option labels. In this case the levels cat1, cat2, and cat3 have been chosen.
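
To see how right=FALSE treats the boundaries, here is a minimal sketch on a few hand-picked values. The vector x is hypothetical and only stands in for glyhb; its values were chosen to sit on and around the two cut points.

```r
# Hypothetical test values placed on and around the cut points 5.7 and 6.5
x <- c(2.85, 5.69, 5.7, 6.4, 6.5, 16.11)

# right = FALSE closes each interval on the left:
# [0, 5.7), [5.7, 6.5), [6.5, Inf)
res <- cut(x, breaks = c(0, 5.7, 6.5, Inf), right = FALSE,
           labels = paste0("cat", 1:3))

as.character(res)
# "cat1" "cat1" "cat2" "cat2" "cat3" "cat3"

# Count how many values fall into each category
table(res)
```

With the default right=TRUE, the values 5.7 and 6.5 would instead fall into the lower category on each side.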

Hope this helps.

RHertel
1

Here's an example using nested ifelse:

set.seed(999)
glyhb <- runif(100, 2.85, 16.11)
categorical_glyhb <- factor(ifelse(glyhb >= 6.5, 3, ifelse(glyhb >= 5.7, 2, 1)))
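
As a quick sanity check, the nested ifelse result can be compared with cut() on the same simulated data. This is only a sketch: the names by_ifelse and by_cut are made up for illustration, and levels = 1:3 is added so that all three levels exist even if one category happens to be empty.

```r
set.seed(999)
glyhb <- runif(100, 2.85, 16.11)  # simulated data, not the real variable

# Nested ifelse: test the highest threshold first, then the next one down
by_ifelse <- factor(ifelse(glyhb >= 6.5, 3, ifelse(glyhb >= 5.7, 2, 1)),
                    levels = 1:3)

# Same categories via cut(), with intervals closed on the left
by_cut <- cut(glyhb, breaks = c(0, 5.7, 6.5, Inf), right = FALSE,
              labels = 1:3)

# TRUE when both approaches assign every value to the same category
all(as.character(by_ifelse) == as.character(by_cut))
```

The ifelse version needs no upper or lower bound, whereas cut() returns NA for any value outside the range given in breaks.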
lune