
I have a large dataset (called 'cud1') to which I want to add a new column ('q2.2_healthCat') that groups the primary health complaints into simpler health categories. That is, primary health complaints 1, 2, 4 or 6 should be categorised as 'mental health' (category 1), complaints 3, 5, 7 or 8 as 'pain' (category 2), and all other responses (9, 10, 11, 12) as 'other' (category 3). Here's a basic data frame to give you an idea:

Participant_ID <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)  
Primary_health_complaint <- c(3, 7, 12, 11, 3, 1, 9, 4, 6, 2)
cud1 <- data.frame(Participant_ID, Primary_health_complaint)

Then I would like a new column saying:

q2.2_healthCat <- c(2, 2, 3, 3, 2, 1, 3, 1, 1, 1)

Here's my attempt (using case_when this time):

cud1 <- cud1 %>% mutate(q2.2_healthCat = case_when(
primary_health_complaint = c(1,2,4,6), '1', 
primary_health_complaint = c(3,5,7,8), '2',
primary_health_complaint = c(9,10,11,12), '3')) 

Hope someone can help! Please be kind, as I'm new to R. I've had a look at many other posts and can't figure out what I'm doing wrong.

Edit: Found the solution in the question "case_when in mutate pipe", using something along these lines:

 require(data.table) ## 1.9.2+
 setDT(df)
 df[a %in% c(0,1,3,4) | c == 4, g := 3L]
 df[a %in% c(2,5,7) | (a==1 & b==4), g := 2L]
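Adapted to the example data from the question, the same pattern might look like the sketch below. It assumes the data.table package is installed and reuses the column names from the example data frame; the `df`, `a`, `b`, `c` and `g` names in the snippet above are placeholders from the linked question.

```r
library(data.table)

# Example data from the question
cud1 <- data.frame(
  Participant_ID = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10),
  Primary_health_complaint = c(3, 7, 12, 11, 3, 1, 9, 4, 6, 2)
)

# Convert to a data.table by reference (no copy is made)
setDT(cud1)

# Assign each category to the rows whose complaint code is in the given set
cud1[Primary_health_complaint %in% c(1, 2, 4, 6), q2.2_healthCat := 1L]      # mental health
cud1[Primary_health_complaint %in% c(3, 5, 7, 8), q2.2_healthCat := 2L]      # pain
cud1[Primary_health_complaint %in% c(9, 10, 11, 12), q2.2_healthCat := 3L]   # other
```

Rows whose complaint code matches none of the three sets are left as NA.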
Michael
  • Never use `<-` in an `ifelse` statement, neither the conditional nor the yes/no arguments. It's difficult to imagine this can work when the "data" you give is not valid R syntax. Regardless, since you're using `dplyr` (*please* be explicit about non-base packages), look into `case_when`, it will greatly simplify your nested `ifelse` intentions. – r2evans Dec 04 '20 at 01:45
  • So just to clarify, you're saying it should be a = instead of a <-? And I just noticed that I forgot to put a c() before the vectors, which I guess is what you're referring to by invalid R syntax in my data. It probably would have been simpler just to tell me that explicitly. And yes, I am referring to the dplyr package, sorry for not specifying. Please remember that I'm still learning, so most of these words are still jargon to me! I had a go with case_when but still no luck unfortunately... – Michael Dec 04 '20 at 04:14
  • No. I'm saying the premise of *assignment* inside an `ifelse` is legal but rarely what is truly needed. (`=` will likely fail, though, because it will be interpreted as a named-argument, which is unlikely to match.) BTW, that use of `require` is incorrect, see https://stackoverflow.com/a/51263513/3358272. – r2evans Dec 04 '20 at 14:02

1 Answer


Maybe you can try the following nested ifelse:

within(cud1, q2.2_healthCat <- ifelse(
  Primary_health_complaint %in% c(1, 2, 4, 6), 1,
  ifelse(Primary_health_complaint %in% c(3, 5, 7, 8), 2, 3)
))

which gives

   Participant_ID Primary_health_complaint q2.2_healthCat
1               1                        3              2
2               2                        7              2
3               3                       12              3
4               4                       11              3
5               5                        3              2
6               6                        1              1
7               7                        9              3
8               8                        4              1
9               9                        6              1
10             10                        2              1
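Since the question uses dplyr, the same mapping can also be sketched with `case_when`. This is a minimal sketch that assumes dplyr is installed. Compared with the attempt in the question, it tests membership with `%in%` rather than `=`, separates each condition from its value with `~` rather than a comma, and spells the column name with the capital `P` used in the data frame.

```r
library(dplyr)

# Example data from the question
cud1 <- data.frame(
  Participant_ID = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10),
  Primary_health_complaint = c(3, 7, 12, 11, 3, 1, 9, 4, 6, 2)
)

# Each line is `condition ~ value`; the first matching condition wins
cud1 <- cud1 %>%
  mutate(q2.2_healthCat = case_when(
    Primary_health_complaint %in% c(1, 2, 4, 6) ~ 1,      # mental health
    Primary_health_complaint %in% c(3, 5, 7, 8) ~ 2,      # pain
    Primary_health_complaint %in% c(9, 10, 11, 12) ~ 3    # other
  ))
```

Unlike the nested ifelse, which puts everything that is not in the first two sets into category 3, this version returns NA for any complaint code outside 1 to 12.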
ThomasIsCoding