Using dplyr to gather dummy variables

Question

I have a dataset of five dummy variables that looks like this:

> head(type)
  convertible coupe hatchback sedan wagon
1           0     0         0     1     0
2           0     1         0     0     0
3           1     0         0     0     0
4           1     0         0     0     0
5           1     0         0     0     0
6           1     0         0     0     0

Using dplyr, how can I create a new variable called "TypeOfCar" that collapses all of the dummy variables into one? Thanks!

Edit: Sorry for the ambiguity. Given the data above, is there a way in dplyr to gather the current set of dummy variables into ONE variable called TypeOfCar? Example below (rows correspond to IDs 1-6 above):

    TypeOfCar
1     sedan
2     coupe
3     convertible
4     convertible
5     convertible
6     convertible

Try `type$TypeOfCar <- names(type)[max.col(type)]`. You don't need dplyr, there's no grouping. — Rich Scriven, Feb 04 '17 at 05:51
Is there also an example for when you want exactly the same thing, but without, for example, the column "sedan"? (So how can we exclude variables from the gather?) — R overflow, Oct 25 '18 at 13:52
Does this answer your question? [Simplest creation of factor variable from dummies](https://stackoverflow.com/questions/41670314/simplest-creation-of-factor-variable-from-dummies) — mkrasmus, May 14 '21 at 06:19
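The two comments above (the `max.col` one-liner and the follow-up about leaving a column out) can be sketched together as follows. This is only an illustration: the data frame is rebuilt by hand from the `head(type)` printout, `ties.method = "first"` is added for determinism, and marking all-zero rows as `NA` is an assumption about what "excluding" a column should mean.

```r
# Rebuild the six rows shown in the question (values copied from head(type)).
type <- data.frame(
  convertible = c(0, 0, 1, 1, 1, 1),
  coupe       = c(0, 1, 0, 0, 0, 0),
  hatchback   = c(0, 0, 0, 0, 0, 0),
  sedan       = c(1, 0, 0, 0, 0, 0),
  wagon       = c(0, 0, 0, 0, 0, 0)
)

# max.col() returns, for each row, the position of the largest value,
# which is the column holding the 1 when exactly one dummy is set.
car_type <- names(type)[max.col(type, ties.method = "first")]

# Leaving out a column (here "sedan"): rows whose only 1 was in the dropped
# column become all-zero, so they are marked NA instead of being guessed.
keep    <- setdiff(names(type), "sedan")
sub     <- type[keep]
reduced <- keep[max.col(sub, ties.method = "first")]
reduced[rowSums(sub) == 0] <- NA
```

Without the `NA` step, an all-zero row would silently be assigned the first remaining column, which is probably not what is wanted.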

akrun · Answer 1 · 2017-02-04T08:25:47.847

3

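We can use base R. Multiplying the dummy matrix by the column positions gives, for each row, the index of the column that holds the 1 (this assumes exactly one 1 per row):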

data.frame(TypeOfCar = names(type)[as.matrix(type) %*% seq_along(type)],
           stringsAsFactors = FALSE)
#    TypeOfCar
#1       sedan
#2       coupe
#3 convertible
#4 convertible
#5 convertible
#6 convertible
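To see why the matrix product works, the intermediate index vector can be inspected on its own. The sketch below rebuilds the six rows from the question by hand; the variable names `idx`, `car_type`, `bad` and `bad_idx` are introduced here for illustration only.

```r
# Rebuild the six rows shown in the question (values copied from head(type)).
type <- data.frame(
  convertible = c(0, 0, 1, 1, 1, 1),
  coupe       = c(0, 1, 0, 0, 0, 0),
  hatchback   = c(0, 0, 0, 0, 0, 0),
  sedan       = c(1, 0, 0, 0, 0, 0),
  wagon       = c(0, 0, 0, 0, 0, 0)
)

# Each row is multiplied by the column positions 1..5 and summed.
# With a single 1 per row, the sum is the position of that 1.
idx <- drop(as.matrix(type) %*% seq_along(type))
car_type <- names(type)[idx]

# Caveat: a row with two 1s yields the SUM of the two positions, which
# points at an unrelated column (here 1 + 2 = 3, i.e. "hatchback").
bad <- c(convertible = 1, coupe = 1, hatchback = 0, sedan = 0, wagon = 0)
bad_idx <- sum(bad * seq_along(bad))
names(type)[bad_idx]
```

If the data might contain rows with zero or several 1s, a check such as `stopifnot(all(rowSums(type) == 1))` before applying the trick would catch them.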

edited Feb 04 '17 at 08:25

answered Feb 04 '17 at 05:01

akrun

Not selected, but I would choose this over the selected answer because it retains the original row order. However, dplyr does read more easily. – mkrasmus Jun 21 '19 at 12:59

score 3 · Accepted Answer · answered Feb 04 '17 at 10:40

This can be done with the tidyverse, specifically tidyr and dplyr. The following code produces the values you are after, although the rows come out ordered by column rather than in the original row order.

library(tidyverse)
type %>% gather(TypeOfCar, Count) %>% filter(Count >= 1) %>% select(TypeOfCar)

Output:

   TypeOfCar
    <chr>
1 convertible
2 convertible
3 convertible
4 convertible
5       coupe
6       sedan

Hopefully this solves your problem, let me know if any changes are needed! Thanks.
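As a possible variation on the code above, `pivot_longer()` (which superseded `gather()` in tidyr 1.0.0) walks the data row by row, so the result stays in the original row order. This is a sketch on a hand-built copy of the six rows from the question, not part of the original answer; `result` is a name chosen here.

```r
library(dplyr)
library(tidyr)

# Rebuild the six rows shown in the question (values copied from head(type)).
type <- data.frame(
  convertible = c(0, 0, 1, 1, 1, 1),
  coupe       = c(0, 1, 0, 0, 0, 0),
  hatchback   = c(0, 0, 0, 0, 0, 0),
  sedan       = c(1, 0, 0, 0, 0, 0),
  wagon       = c(0, 0, 0, 0, 0, 0)
)

# Reshape to one row per (original row, column) pair, keep the pairs where
# the dummy is set, and drop the helper column.
result <- type %>%
  pivot_longer(everything(), names_to = "TypeOfCar", values_to = "Count") %>%
  filter(Count >= 1) %>%
  select(TypeOfCar)
```

Like the `gather()` version, this returns one row per 1 in the data, so a row with no 1 disappears and a row with two 1s appears twice.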

George

If this helped to answer your question it would be greatly appreciated if you could mark the question as answered. Thanks! :) – George Feb 04 '17 at 11:36
Hey George, how would I be able to put it back into the dataset in order? It seems when I try to cbind back to the original dataframe, the data is scattered. – G. Nguyen Feb 04 '17 at 19:17
Hi :) This does that: `library(tidyverse)` then `D %>% mutate(ID = 1:nrow(D)) %>% gather(TypeOfCar, Count, -ID) %>% filter(Count >= 1) %>% arrange(ID) %>% select(TypeOfCar)` – George Feb 04 '17 at 19:48
No problem! Glad I could help :) – George Feb 04 '17 at 20:05
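One way to attach the result back to the original data frame, following the ID idea from the comment above, is to join on the ID rather than `cbind` the pieces. This is a sketch under assumptions: the data frame is rebuilt by hand from the question, and `with_id`, `labels` and `result` are names chosen here for illustration.

```r
library(dplyr)
library(tidyr)

# Rebuild the six rows shown in the question (values copied from head(type)).
type <- data.frame(
  convertible = c(0, 0, 1, 1, 1, 1),
  coupe       = c(0, 1, 0, 0, 0, 0),
  hatchback   = c(0, 0, 0, 0, 0, 0),
  sedan       = c(1, 0, 0, 0, 0, 0),
  wagon       = c(0, 0, 0, 0, 0, 0)
)

# Tag each row with an ID so the label can be matched back to it later.
with_id <- type %>% mutate(ID = row_number())

# Same gather/filter steps as in the comment, but keeping the ID.
labels <- with_id %>%
  gather(TypeOfCar, Count, -ID) %>%
  filter(Count >= 1) %>%
  select(ID, TypeOfCar)

# left_join() keeps the row order of with_id, so the labels line up
# with the original rows regardless of the order gather() produced.
result <- with_id %>%
  left_join(labels, by = "ID") %>%
  select(-ID)
```

Joining by ID avoids relying on row positions, which is what scrambles the data when the gathered output is bound back with `cbind`.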