Convert a factor to indicator variables?

Question

How do I convert a factor in R to several indicator variables, one for each level?

http://stackoverflow.com/questions/5048638/automatically-expanding-an-r-factor-into-a-collection-of-1-0-indicator-variables/5048726#5048726 — Ben Bolker, Feb 17 '13 at 15:09

score 8 · Accepted Answer · answered Feb 17 '13 at 14:10

One way is to use model.matrix():

model.matrix(~Species, iris)

    (Intercept) Speciesversicolor Speciesvirginica
1             1                 0                0
2             1                 0                0
3             1                 0                0

....

148           1                 0                1
149           1                 0                1
150           1                 0                1
attr(,"assign")
[1] 0 1 1
attr(,"contrasts")
attr(,"contrasts")$Species
[1] "contr.treatment"

Andrie

I think you have to add `-1` in the formula otherwise there will be a missing level in the resulting matrix. – juba Feb 17 '13 at 14:12
@juba That's a good point, but I think it depends on your objective. In dummy coding, you need `n-1` dummy variables to represent `n` levels. So, in the `iris$Species` example, values of `0` and `0` in the two dummy columns mean the species is `setosa`. – Andrie Feb 17 '13 at 14:16
@Andrie you're right, it depends on the result you want to get, I didn't think about this. – juba Feb 17 '13 at 14:17
@Andrie are you aware of some standard way to reverse this? I.e. get a factor variable from a given model.matrix? – Matt Bannert Oct 11 '15 at 09:35
How do you merge model.matrix output back into the original dataframe? – stackoverflowuser2010 May 21 '16 at 00:51
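
The last two comments (reversing the coding, and merging it back into the data frame) can be sketched in base R. This assumes the full set of indicators from the `- 1` form, so that every row contains exactly one 1; the names `ind`, `iris_wide` and `recovered` are illustrative only.

```r
# Full indicator matrix: one column per level, no intercept.
ind <- model.matrix(~ Species - 1, data = iris)

# Merge back into the original data frame. The rows line up because
# iris$Species has no missing values; model.matrix() drops rows with NA
# by default, in which case the row counts would no longer match.
iris_wide <- cbind(iris, ind)

# Reverse: find the column holding the 1 in each row, then map that
# column position back to the corresponding level.
recovered <- factor(levels(iris$Species)[max.col(ind)],
                    levels = levels(iris$Species))
stopifnot(identical(recovered, iris$Species))
```

With the default treatment coding (`n-1` columns), an all-zero row stands for the first level, so this `max.col()` approach would need an extra step for that case.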

score 5 · Answer 2 · answered Feb 17 '13 at 14:11

There are several ways to do it; one is model.matrix() with the intercept removed (`-1`), which gives one column per level:

color <- factor(c("red","green","red","blue"))
data.frame(model.matrix(~color-1))
#   colorblue colorgreen colorred
# 1         0          0        1
# 2         0          1        0
# 3         0          0        1
# 4         1          0        0

score 3 · Answer 3 · answered Feb 17 '13 at 14:15

If I understood your question correctly, use the model.matrix() command, like this:

dd <- data.frame(a = gl(3,4), b = gl(4,1,12))
model.matrix(~ a + b, dd)
   (Intercept) a2 a3 b2 b3 b4
1            1  0  0  0  0  0
2            1  0  0  1  0  0
3            1  0  0  0  1  0
4            1  0  0  0  0  1
5            1  1  0  0  0  0
6            1  1  0  1  0  0
7            1  1  0  0  1  0
8            1  1  0  0  0  1
9            1  0  1  0  0  0
10           1  0  1  1  0  0
11           1  0  1  0  1  0
12           1  0  1  0  0  1
attr(,"assign")
[1] 0 1 1 2 2 2
attr(,"contrasts")
attr(,"contrasts")$a
[1] "contr.treatment"

attr(,"contrasts")$b
[1] "contr.treatment"
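
With more than one factor, removing the intercept only restores the dropped level of the first factor; later factors still lose their first level. One possible way to get an indicator for every level of every factor is the `contrasts.arg` argument of model.matrix(). This is a sketch reusing the `dd` data frame from this answer; the name `full` is illustrative.

```r
dd <- data.frame(a = gl(3, 4), b = gl(4, 1, 12))

# contrasts(f, contrasts = FALSE) returns an identity-style coding that
# keeps every level, so neither factor loses its first level.
full <- model.matrix(~ a + b - 1, data = dd,
                     contrasts.arg = list(
                       a = contrasts(dd$a, contrasts = FALSE),
                       b = contrasts(dd$b, contrasts = FALSE)
                     ))
colnames(full)
```

The resulting columns are linearly dependent (the `a` columns and the `b` columns each sum to 1), which is fine for a set of indicators but matters if the matrix is fed to an unpenalised regression.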

score 2 · Answer 4 · answered Feb 17 '13 at 14:11

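Try this:

myfactors <- factor(sample(c("f1", "f2", "f3"), 10, replace = TRUE))
# Index the identity matrix by the factor's integer codes; drop = FALSE keeps a matrix even with one level
myIndicators <- diag(nlevels(myfactors))[myfactors, , drop = FALSE]
colnames(myIndicators) <- levels(myfactors)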
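
As a quick sanity check, the identity-matrix indexing can be compared with the model.matrix() approach from the other answers. This is only a sketch: the seed and the names `f`, `by_diag`, `by_mm` and `same` are arbitrary choices for illustration.

```r
set.seed(1)  # arbitrary seed, only to make the sketch reproducible
f <- factor(sample(c("f1", "f2", "f3"), 10, replace = TRUE))

# Indexing by a factor uses its integer codes, so row i of the result
# is the identity-matrix row for the level of f[i].
by_diag <- diag(nlevels(f))[f, , drop = FALSE]

# Same indicators via model.matrix(), one column per level.
by_mm <- model.matrix(~ f - 1)

same <- all(by_diag == by_mm)
same
```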

Aditya Sihag
