
I have a data frame with many variables and would like to convert some of them to ordered factors, with several numeric values mapping to the same factor level. See the following example:

mydf <- data.frame(replicate(3,sample(0:2,10,replace=TRUE)))
mydf[6, ] <- c(NA, NA, 2)
names(mydf) <- c("med", "fed", "id")

mydf
   med fed id
1    2   2  1
2    0   0  0
3    0   1  0
4    0   1  2
5    1   0  2
6   NA  NA  2
7    0   1  2
8    0   2  0
9    0   0  2
10   2   2  2

I would like to convert the variables med and fed to ordered factors with two levels: 0 becomes "foo", and 1 and 2 both become "bar", where "foo" < "bar". I know I can do this separately for each variable, as hinted here:

mydf$med <- `levels<-`(factor(mydf$med, ordered=TRUE), list("foo"=0, "bar"=c(1,2)))
mydf$fed <- `levels<-`(factor(mydf$fed, ordered=TRUE), list("foo"=0, "bar"=c(1,2)))

mydf
    med  fed id
1   bar  bar  1
2   foo  foo  0
3   foo  bar  0
4   foo  bar  2
5   bar  foo  2
6  <NA> <NA>  2
7   foo  bar  2
8   foo  bar  0
9   foo  foo  2
10  bar  bar  2 

table(mydf$med)
foo bar 
  6   3 
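For readers unfamiliar with the list form of `levels<-`, here is a minimal sketch on a small hand-written vector (the vector `x` is made up for illustration and is not part of the data above). Each list name becomes a new level, and the values listed under it are the old levels merged into it.

```r
# Illustrative input, not taken from mydf
x <- c(2, 0, 0, 1, NA, 0)

# Start from an ordered factor whose levels are "0" < "1" < "2"
f <- factor(x, ordered = TRUE)

# Merge old levels: "0" -> "foo", "1" and "2" -> "bar".
# The new level order follows the order of the list names.
levels(f) <- list(foo = 0, bar = c(1, 2))

levels(f)        # "foo" "bar"
is.ordered(f)    # TRUE, the ordered class is kept
as.character(f)  # "bar" "foo" "foo" "bar" NA "foo"
```

The NA entry stays NA, which matches row 6 of the output above.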

Is there a way to do this in one step for every variable whose name matches a pattern? For example, with dplyr, something like:

mydf %>% mutate_each(funs(???), matches("ed$"))
janosdivenyi

1 Answer


As David Arenburg pointed out, the solution is straightforward: wrap the one-variable recoding (in its simpler, multi-line form) in a function and apply it with dplyr to every column that matches the pattern.

library(dplyr)
myfunc <- function(x) {
    x <- factor(x, ordered=TRUE)
    levels(x) <- list("foo"=0, "bar"=c(1,2))
    x
}

mydf <- mydf %>% mutate_each(funs(myfunc), matches("ed$"))
mydf
    med  fed id
1   bar  bar  1
2   foo  foo  0
3   foo  bar  0
4   foo  bar  2
5   bar  foo  2
6  <NA> <NA>  2
7   foo  bar  2
8   foo  bar  0
9   foo  foo  2
10  bar  bar  2 

table(mydf$med)
foo bar 
  6   3 
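Note that `mutate_each()` and `funs()` have since been deprecated in dplyr. The following is a sketch of two equivalent formulations, assuming dplyr 1.0.0 or later for `across()`; the base R variant needs no packages. The data frame is typed in by hand to reproduce the values printed in the question, and the names `res_dplyr` and `res_base` are only illustrative.

```r
library(dplyr)

# Same values as the data frame printed in the question
mydf <- data.frame(
  med = c(2, 0, 0, 0, 1, NA, 0, 0, 0, 2),
  fed = c(2, 0, 1, 1, 0, NA, 1, 2, 0, 2),
  id  = c(1, 0, 0, 2, 2, 2, 2, 0, 2, 2)
)

# Recode one column: 0 -> "foo", 1 and 2 -> "bar", with foo < bar
myfunc <- function(x) {
  x <- factor(x, ordered = TRUE)
  levels(x) <- list("foo" = 0, "bar" = c(1, 2))
  x
}

# dplyr (assumes version >= 1.0.0): apply myfunc to columns ending in "ed"
res_dplyr <- mydf %>% mutate(across(matches("ed$"), myfunc))

# Base R: select the same columns by name and replace them in place
res_base <- mydf
cols <- grepl("ed$", names(res_base))
res_base[cols] <- lapply(res_base[cols], myfunc)

table(res_dplyr$med)
```

Both variants leave the id column untouched, because its name does not match the pattern.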
janosdivenyi