I have a column of data that is a factor with levels A, B and C. I am interested in combining two of these levels into one, so that the levels would become A and B, with B = B and C, or maybe a new pair of levels A and D, with D = B and C. I can come up with plenty of ways to do this by looping through the column with if statements, but I feel like there should be a more elegant approach, and I was wondering if someone could point me in the right direction.
More recent, better answer: http://stackoverflow.com/questions/19410108/cleaning-up-factor-levels-collapsing-multiple-levels-labels – Jack Tanner Apr 07 '15 at 19:26
3 Answers
93
Use `levels(x) <- ...` to specify new levels and to combine some of the previous levels. For example:
f <- factor(LETTERS[c(1:3, 3:1)])
f
[1] A B C C B A
Levels: A B C
Now combine "A" and "B" into a single level:
levels(f) <- c("A", "A", "C")
f
[1] A A C C A A
Levels: A C
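
The question also asks for a new level D that stands for both B and C. A minimal sketch of that case, assuming the same example factor `f` as above: base R's `levels<-` also accepts a named list, where each name is a new level and each element lists the old levels it absorbs.

```r
# Same example factor as above
f <- factor(LETTERS[c(1:3, 3:1)])

# Each list name is a new level; each element holds the old levels it replaces
levels(f) <- list(A = "A", D = c("B", "C"))
f
# [1] A D D D D A
# Levels: A D
```

With the list form, any old level that is not mentioned becomes NA, so every level to be kept needs an entry.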

Andrie
What if I have, say, 100 levels (0, 1, 2, 3, ..., 99) and I want to end up with only 3 (0, 1, biggerthan1)? – skan Nov 29 '16 at 16:57
@Hatshepsut use `levels<-`; see https://stackoverflow.com/questions/10449366/levels-what-sorcery-is-this/10491881#10491881 – Frank Sep 17 '19 at 13:11
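
For the many-levels case raised in the comment above, one possible sketch is to compute the replacement labels from the existing levels instead of typing them out. The data in `x` is made up for illustration, and the label `biggerthan1` is taken from the comment.

```r
# Hypothetical data: a factor with 100 levels, "0" through "99"
x <- factor(c(0, 1, 2, 50, 99, 1), levels = 0:99)

lv <- levels(x)
# Keep "0" and "1" as they are; give every other level one shared label
levels(x) <- ifelse(lv %in% c("0", "1"), lv, "biggerthan1")

levels(x)
# [1] "0"           "1"           "biggerthan1"
```

Assigning a character vector with repeated labels merges the corresponding levels, which is the same mechanism as `levels(f) <- c("A", "A", "C")` in the answer above.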
23
If you're using dplyr pipes, you can use the forcats package.
library(forcats)
f %>% fct_collapse(A = c("A","B"))
#[1] A A C C A A
#Levels: A C
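
A sketch of the variant the question asks about (a new level D covering B and C), assuming the forcats package is installed and the same example factor `f` as in the first answer. `fct_collapse()` returns a new factor and does not modify `f`, so the result has to be assigned. `fct_recode()` can do the same thing by mapping several old levels to one new name.

```r
library(forcats)

f <- factor(LETTERS[c(1:3, 3:1)])

# Collapse B and C into a new level D; the result is a new factor
g <- fct_collapse(f, D = c("B", "C"))
levels(g)
# [1] "A" "D"

# Same result with fct_recode(): repeat the new name for each old level
h <- fct_recode(f, D = "B", D = "C")
```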

Joe
4
The rockchalk library is able to combine levels. I think it's great. If you want to combine B and C together in a factor, do this:
library(rockchalk)
combineLevels(mydf$facVar, levs = c("B", "C"), newLabel = c("BandC"))

debo