I have a dataset that consists of 5 dummy variables and looks like this:

> head(type)
  convertible coupe hatchback sedan wagon
1           0     0         0     1     0
2           0     1         0     0     0
3           1     0         0     0     0
4           1     0         0     0     0
5           1     0         0     0     0
6           1     0         0     0     0

Using dplyr, how can I create a new variable called "TypeOfCar" that collapses all of the dummy variables into a single column? Thanks!

Edit: Sorry for the ambiguity. Using the data above, I was wondering if there is a way in dplyr to gather the current set of dummy variables into ONE variable called TypeOfCar. Example below (corresponding to IDs 1-6 above):

    TypeOfCar
1     sedan
2     coupe
3     convertible
4     convertible
5     convertible
6     convertible
user2100721
G. Nguyen

2 Answers


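We can use base R. Multiplying the 0/1 matrix by the column positions gives, for each row, the index of the column that holds the 1 (this assumes exactly one 1 per row):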

data.frame(TypeOfCar = names(type)[as.matrix(type) %*% seq_along(type)],
           stringsAsFactors = FALSE)
#    TypeOfCar
#1       sedan
#2       coupe
#3 convertible
#4 convertible
#5 convertible
#6 convertible
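
To make the matrix-multiplication step easier to follow, here is a minimal sketch that rebuilds the six rows from the question and shows the intermediate column index. The `idx` and `alt` names are illustrative, and the `rowSums()` guard is an added assumption check (exactly one 1 per row) rather than part of the original answer. `max.col()` is shown only as a base R cross-check.

```r
# Rebuild the six example rows shown in the question.
type <- data.frame(
  convertible = c(0, 0, 1, 1, 1, 1),
  coupe       = c(0, 1, 0, 0, 0, 0),
  hatchback   = c(0, 0, 0, 0, 0, 0),
  sedan       = c(1, 0, 0, 0, 0, 0),
  wagon       = c(0, 0, 0, 0, 0, 0)
)

# Assumption: every row has exactly one dummy set to 1.
stopifnot(all(rowSums(type) == 1))

# Each row times 1:5 yields the position of the column holding the 1.
idx <- as.vector(as.matrix(type) %*% seq_along(type))
idx
# 4 2 1 1 1 1

# Cross-check: max.col() returns the position of the row maximum.
alt <- max.col(as.matrix(type), ties.method = "first")

# Look up the column names and attach them in the original row order.
type$TypeOfCar <- names(type)[idx]
```

Because the result is computed row by row, it can be assigned straight back onto the data frame without reordering.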
akrun
  • Not selected, but I would choose this over the selected answer given that it retains the original row order. However, dplyr does read easier. – mkrasmus Jun 21 '19 at 12:59

This can be done using the 'tidyverse' library, specifically 'tidyr' and 'dplyr'. The following code produces the output you are after.

library(tidyverse)
type %>% gather(TypeOfCar, Count) %>% filter(Count >= 1) %>% select(TypeOfCar)

Output:

   TypeOfCar
    <chr>
1 convertible
2 convertible
3 convertible
4 convertible
5       coupe
6       sedan
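
Note that `gather()` stacks the columns one after another, so the result above is grouped by car type rather than kept in the original row order. Below is a minimal sketch of one way to keep the order and attach the label back to the data, assuming tidyr 1.0.0 or later (where `pivot_longer()` supersedes `gather()`). The `ID` column and the `labels` and `type_labelled` names are illustrative and not part of the original answer.

```r
library(dplyr)
library(tidyr)

# Rebuild the six example rows shown in the question.
type <- data.frame(
  convertible = c(0, 0, 1, 1, 1, 1),
  coupe       = c(0, 1, 0, 0, 0, 0),
  hatchback   = c(0, 0, 0, 0, 0, 0),
  sedan       = c(1, 0, 0, 0, 0, 0),
  wagon       = c(0, 0, 0, 0, 0, 0)
)

# Tag each row with its position so the order survives reshaping.
labels <- type %>%
  mutate(ID = row_number()) %>%
  pivot_longer(-ID, names_to = "TypeOfCar", values_to = "Count") %>%
  filter(Count >= 1) %>%
  arrange(ID) %>%
  select(ID, TypeOfCar)

# Join the label back onto the original columns by row position.
type_labelled <- type %>%
  mutate(ID = row_number()) %>%
  left_join(labels, by = "ID")
```

Joining on `ID` instead of using `cbind()` avoids pairing labels with the wrong rows if some row has no dummy set.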

Hopefully this solves your problem. Let me know if any changes are needed! Thanks.

George
  • If this helped to answer your question it would be greatly appreciated if you could mark the question as answered. Thanks! :) – George Feb 04 '17 at 11:36
  • Hey George, how would I be able to put it back into the dataset in order? It seems when I try to cbind back to the original dataframe, the data is scattered. – G. Nguyen Feb 04 '17 at 19:17
  • Hi :) This does that (where D is your data frame): library(tidyverse); D %>% mutate(ID = 1:nrow(D)) %>% gather(TypeOfCar, Count, -ID) %>% filter(Count >= 1) %>% arrange(ID) %>% select(TypeOfCar) – George Feb 04 '17 at 19:48
  • No problem! Glad I could help :) – George Feb 04 '17 at 20:05