
The data I used look like this:

data
subject    cluster
A          1
B          1
C          2
D          3
E          2
F          1
G          3
H          3
I          4
J          4
K          5
L          6
M          7
N          5
O          3
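
For anyone who wants to reproduce this, the table above could be rebuilt roughly as follows. This is only a sketch: the lowercase column names are an assumption, chosen so that they match the `data$cluster` reference in the code further down.

```r
# Rebuild the example table shown above.
# Assumption: lowercase column names, to match data$cluster used later.
data <- data.frame(
  subject = LETTERS[1:15],  # "A" through "O"
  cluster = c(1, 1, 2, 3, 2, 1, 3, 3, 4, 4, 5, 6, 7, 5, 3)
)

print(data)
```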

Based on the cluster column, I want to create a new column called verdict that notes whether each subject passed, needs remedial work, or failed.

If a subject is in:

* Cluster 1 or 3, they failed

* Cluster 2 or 5, they need remedial

* Cluster 4, 6, or 7, they passed

The final data should look like this:

data
subject    cluster    verdict
A          1          Failed
B          1          Failed
C          2          Remedial
D          3          Failed
E          2          Remedial
F          1          Failed
G          3          Failed
H          3          Failed
I          4          Passed
J          4          Passed
K          5          Remedial
L          6          Passed
M          7          Passed
N          5          Remedial
O          3          Failed

I already tried simple code like this:

data$verdict <-
  ifelse(data$cluster == 1 | data$cluster == 3, 'Failed',
         ifelse(data$cluster == 2 | data$cluster == 5, 'Remedial', 'Passed'))

It works, but it doesn't feel efficient, especially if I have a large number of clusters and/or verdicts. Is there a more efficient way to do this?

undernoob

1 Answer


Instead of chaining == comparisons, use %in%, and replace the nested ifelse() calls with case_when() from dplyr.

library(dplyr)

data %>%
  mutate(verdict = case_when(
    cluster %in% c(1, 3) ~ "Failed",
    cluster %in% c(2, 5) ~ "Remedial",
    TRUE ~ "Passed"  # fallback for every other cluster (4, 6, 7 here)
  ))
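
If the number of clusters keeps growing, another option that might scale better is a lookup vector, so that the mapping lives in one place instead of in a chain of conditions. The following is a minimal base R sketch, not part of the approach above; `verdict_by_cluster` is a hypothetical name, and it assumes the cluster ids are whole numbers stored in a lowercase `cluster` column.

```r
# Hypothetical lookup vector: names are cluster ids, values are verdicts.
verdict_by_cluster <- c(
  "1" = "Failed",   "3" = "Failed",
  "2" = "Remedial", "5" = "Remedial",
  "4" = "Passed",   "6" = "Passed", "7" = "Passed"
)

# Example data rebuilt from the question (lowercase names are an assumption).
data <- data.frame(
  subject = LETTERS[1:15],
  cluster = c(1, 1, 2, 3, 2, 1, 3, 3, 4, 4, 5, 6, 7, 5, 3)
)

# Index by name (hence as.character), not by position;
# unname() drops the cluster ids that would otherwise be carried along.
data$verdict <- unname(verdict_by_cluster[as.character(data$cluster)])

print(data)
```

Unlike the `TRUE ~ "Passed"` fallback, a cluster that is missing from the lookup vector ends up as `NA` here, which can make unmapped clusters easier to spot.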
Darren Tsai
Park