
I'm trying to create a grouping variable with values 0 and 1 that records which column each value came from.

For example, values from Var1 would get 0 and values from Var2 would get 1.

Var1 <- c(12, 12.3, 14.1, 6.2, 2.9, 5, 16.2, 2.3, 4.8, 5.9, 15, 12, 11.1)

Var2 <- c(11.2, 15.1, 16, 7.2, 3.1, 1.2, 5.2, 4.1, 3.1, 11.6, 2.1, 6.5, 9.1)

data <- data.frame(Var1, Var2)

I'm looking to do this in both a base R way and a tidyverse way.

writer_typer
  • Can you show the expected output, as the conditions are not clear? – akrun Aug 25 '20 at 22:51
  • I would like to create a new column, say `group`, which will have 0 for all `Var1` information and 1 for all `Var2` information. Something like changing it from a wide to a long format, or making it tidy data. – writer_typer Aug 25 '20 at 22:54

2 Answers


I would suggest reshaping the data to long format and then creating the new variable with the conditions you mentioned:

library(tidyverse)
#Data
Var1 <- c(12, 12.3, 14.1, 6.2, 2.9, 5, 16.2, 2.3, 4.8, 5.9, 15, 12, 11.1)
Var2 <- c(11.2, 15.1, 16, 7.2, 3.1, 1.2, 5.2, 4.1, 3.1, 11.6, 2.1, 6.5, 9.1)
data <- data.frame(Var1, Var2)
#Reshape
data %>% pivot_longer(cols = c(Var1,Var2)) %>%
  mutate(Var=ifelse(name=='Var1',0,ifelse(name=='Var2',1,NA)))

Output:

# A tibble: 26 x 3
   name  value   Var
   <chr> <dbl> <dbl>
 1 Var1   12       0
 2 Var2   11.2     1
 3 Var1   12.3     0
 4 Var2   15.1     1
 5 Var1   14.1     0
 6 Var2   16       1
 7 Var1    6.2     0
 8 Var2    7.2     1
 9 Var1    2.9     0
10 Var2    3.1     1
# ... with 16 more rows
Duck

As we have only two columns, we can create a logical vector and coerce it directly to binary with the unary +:

library(dplyr)
library(tidyr)
data %>%
       pivot_longer(everything()) %>% 
       mutate(grp = +(name == 'Var2'))
# A tibble: 26 x 3
#   name  value   grp
#   <chr> <dbl> <int>
# 1 Var1   12       0
# 2 Var2   11.2     1
# 3 Var1   12.3     0
# 4 Var2   15.1     1
# 5 Var1   14.1     0
# 6 Var2   16       1
# 7 Var1    6.2     0
# 8 Var2    7.2     1
# 9 Var1    2.9     0
#10 Var2    3.1     1
# … with 16 more rows

We could also do this without any reshaping in base R. Note that the result is in column order (all Var1 values first, then all Var2 values):

+(names(data) == 'Var2')[col(data)]
#[1] 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1
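
If a long-format data frame is wanted in base R as well, one possible sketch is to use stack(), which puts every value in a single column and records the source column in a factor called ind. The names long and group below are only illustrative choices, and the rows come out in column order rather than interleaved as with pivot_longer().

```r
# Same example data as in the question
Var1 <- c(12, 12.3, 14.1, 6.2, 2.9, 5, 16.2, 2.3, 4.8, 5.9, 15, 12, 11.1)
Var2 <- c(11.2, 15.1, 16, 7.2, 3.1, 1.2, 5.2, 4.1, 3.1, 11.6, 2.1, 6.5, 9.1)
data <- data.frame(Var1, Var2)

# stack() returns a data frame with columns `values` and `ind`,
# where `ind` is a factor holding the original column name
long <- stack(data)

# Compare against "Var2" and coerce the logical result to integer:
# 0 for rows that came from Var1, 1 for rows that came from Var2
long$group <- as.integer(long$ind == "Var2")

head(long, 3)
```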
akrun