I have a numeric variable, glyhb, with values ranging from 2.85 to 16.11. How can I turn it into a categorical variable where everything under 5.7 is one category, everything from 5.7 to 6.4 is another, and everything 6.5 or higher is a third?
- Best dupe I can find quickly: [R - cut by defined interval](http://stackoverflow.com/q/5746544/903061), maybe someone has a better duplicate? – Gregor Thomas Apr 29 '16 at 16:19
- @Gregor this one is quite similar: [specify-factor-levels-for-intervals](http://stackoverflow.com/questions/21558129/specify-factor-levels-for-intervals) – lmo Apr 29 '16 at 16:57
2 Answers
The function `cut()` divides a numeric vector into intervals according to the values given in the `breaks` argument. Here we add the option `right=FALSE` so that the intervals are closed on the left: the value 5.7 then belongs to category 2 and the value 6.5 to category 3. By default (`right=TRUE`) the intervals are closed on the right, so a value equal to a break falls into the lower interval.
cut(glyhb, breaks=c(0,5.7,6.5,Inf), right=FALSE, labels=paste0("cat", 1:3))
`cut()` returns a factor. The labels of its levels can be set with the `labels` argument; if `labels` is omitted, the levels are named in interval notation, such as `[0,5.7)`. In this case the levels `cat1`, `cat2`, and `cat3` have been chosen.
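To see what `right=FALSE` changes, here is a small sketch that runs the same breaks over a few hand-picked values, including the two boundary values 5.7 and 6.5. The vector `x` and the names `left_closed` and `right_closed` are made up for illustration and are not part of the original data.

```r
# Hypothetical test values, including both boundary values (5.7 and 6.5)
x <- c(2.85, 5.69, 5.7, 6.4, 6.5, 16.11)

# Left-closed intervals: [0,5.7), [5.7,6.5), [6.5,Inf)
left_closed <- cut(x, breaks = c(0, 5.7, 6.5, Inf), right = FALSE,
                   labels = paste0("cat", 1:3))

# Default right-closed intervals: (0,5.7], (5.7,6.5], (6.5,Inf]
right_closed <- cut(x, breaks = c(0, 5.7, 6.5, Inf),
                    labels = paste0("cat", 1:3))

# Side-by-side comparison; only the rows for 5.7 and 6.5 differ
data.frame(x, left_closed, right_closed)
```

With `right=FALSE`, 5.7 lands in `cat2` and 6.5 in `cat3`, as the question asks; with the default, each of them falls one category lower.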
Hope this helps.

RHertel
Here's an example using nested `ifelse`:
set.seed(999)
glyhb <- runif(100, 2.85, 16.11)
categorical_glyhb <- factor(ifelse(glyhb >= 6.5, 3, ifelse(glyhb >= 5.7, 2, 1)))
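As a quick sanity check, the nested `ifelse` result can be compared against `cut()` with left-closed intervals on the same simulated data. This is only a sketch: the names `via_ifelse` and `via_cut` are introduced here for illustration, and the labels "1", "2", "3" are chosen so that the two results line up.

```r
# Same simulated data as in the example above
set.seed(999)
glyhb <- runif(100, 2.85, 16.11)

# Nested ifelse: test the highest threshold first, then the next one down
via_ifelse <- factor(ifelse(glyhb >= 6.5, 3, ifelse(glyhb >= 5.7, 2, 1)))

# Equivalent binning with cut() and left-closed intervals
via_cut <- cut(glyhb, breaks = c(0, 5.7, 6.5, Inf), right = FALSE,
               labels = c("1", "2", "3"))

# TRUE when every observation gets the same category from both approaches
all(as.character(via_ifelse) == as.character(via_cut))

# Number of observations per category
table(via_cut)
```

Both approaches put values between 6.4 and 6.5 in the second category, since 6.5 is the cutoff for the third.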

lune