I would like advice on how to code a new column based on the following dataset:

df <- data.frame(AA = c("3454","345","5","345","567","79","43","2342","231","234","232","24"),
                 BB = c(123, 345, 7567, 234, 8679, 890, 812, 435345, 567, 568, 786, 678),
                 CC = c(1, 2, 6, 8, 22, 33, 56, 2, 34, 45, 45, 65), stringsAsFactors = FALSE)

I would like to create a new column called 'new' based on the following conditions:

  1. Group1 = AA > 300 & BB > 2000 & CC < 5

  2. Group2 = AA ≥ 20 & BB ≤ 700 & CC > 60 but ≤ 70

Thanks!

Darren Tsai
Lili

2 Answers


case_when() in dplyr is designed to avoid the use of nested ifelse()s.

library(dplyr)

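df %>%
  mutate(new = case_when(
    # AA is stored as character, so convert it before comparing numerically
    as.numeric(AA) > 300 & BB > 2000 & CC < 5 ~ "Group1",
    as.numeric(AA) >= 20 & BB <= 700 & CC > 60 & CC <= 70 ~ "Group2",
    TRUE ~ "other"  # rows that match neither condition
  ))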
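
One thing worth checking with the question's data: AA is a character column, so a comparison such as AA > 300 is done on strings, not numbers. The base R sketch below (the names as_text and as_number are only for illustration) shows where the two kinds of comparison disagree:

```r
# AA exactly as defined in the question: character, not numeric
AA <- c("3454", "345", "5", "345", "567", "79", "43", "2342", "231", "234", "232", "24")

# Comparing a character vector with a number coerces the number to a string,
# so the comparison is lexicographic ("5" sorts after "300")
as_text <- AA > 300

# Converting first gives the numeric comparison the conditions describe
as_number <- as.numeric(AA) > 300

# Show only the rows where the two comparisons disagree
data.frame(AA, as_text, as_number)[as_text != as_number, ]
```

If the column is meant to hold numbers, converting it once with as.numeric() before applying the conditions avoids this.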
Darren Tsai

You can try this:

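library(dplyr)
# AA is character in the question's data, so convert it before comparing
df1 <- df %>%
  mutate(new = ifelse(as.numeric(AA) > 300 & BB > 2000 & CC < 5, "Group1",
               ifelse(as.numeric(AA) >= 20 & BB <= 700 & CC > 60 & CC <= 70, "Group2", NA)))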
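
For comparison, here is a minimal base R sketch of the same two conditions without dplyr. It assumes AA should be treated as numeric (the helper name AA_num is hypothetical) and leaves rows that match neither condition as NA:

```r
df <- data.frame(AA = c("3454", "345", "5", "345", "567", "79", "43", "2342", "231", "234", "232", "24"),
                 BB = c(123, 345, 7567, 234, 8679, 890, 812, 435345, 567, 568, 786, 678),
                 CC = c(1, 2, 6, 8, 22, 33, 56, 2, 34, 45, 45, 65),
                 stringsAsFactors = FALSE)

# Assumption: AA holds numbers stored as text, so convert once up front
AA_num <- as.numeric(df$AA)

# Start with NA everywhere, then fill in each group by logical indexing
df$new <- NA_character_
df$new[AA_num > 300 & df$BB > 2000 & df$CC < 5] <- "Group1"
df$new[AA_num >= 20 & df$BB <= 700 & df$CC > 60 & df$CC <= 70] <- "Group2"

# Count rows per group, including the unmatched ones
table(df$new, useNA = "ifany")
```

The two conditions cannot both hold for the same row (CC < 5 versus CC > 60), so the order of the two assignments does not matter here.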
Duck
  • Thanks! I had to make an edit because the last condition for Group2 needs to be >60 but ≤70. Could you amend it, please? Thanks so much! – Lili Jul 20 '20 at 15:48
  • @Lili I have updated it. Is that what you want? – Duck Jul 20 '20 at 15:50
  • this is great! Thanks! – Lili Jul 20 '20 at 15:54
  • @Lili Great! If you found this answer helpful, you could accept it by clicking the tick on the left side of this answer :) – Duck Jul 20 '20 at 15:55
  • What happens when the number you compare against is negative? It gives me an error (in a dataset that has negative values). AA > -300, or something like that? – Lili Jul 20 '20 at 17:02
  • @Lili You should check the data; the code will actually produce `NA` if the conditions are not met. You can run the code you shared and look at the results. – Duck Jul 20 '20 at 17:04
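
The comment above does not show the actual error, so the following is only a guess at one thing worth checking when comparing against a negative number. In R, `<-` is the assignment operator, so a "less than minus" comparison written without a space is read as an assignment. The vectors below are made-up illustration data, not from the question:

```r
AA <- c(-500, -100, 250)

# With a space (or parentheses) this is a comparison against -300
AA < -300
AA < (-300)

# Without the space, R parses `x<-300` as "assign 300 to x",
# so x is overwritten instead of being compared
x <- c(-500, -100, 250)
x<-300
x
```

A "greater than" comparison such as AA > -300 does not have this problem, but it is still a string comparison if AA is a character column, so converting with as.numeric() first is worth trying.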