I'm trying to add a new column to my data frame based on a condition involving two variables. I have two columns, visit.x and visit.y, and I want a new column holding the number of visits. If someone has both an initial and a first visit, the number of visits is 2; if either visit.x or visit.y is NA, the number of visits is 1.

I used the following code.

df3$number_visit <- "NA"
for (i in 1:nrow(df3)) {
  if (df3[i, c("Vistis.x")] == "initial"
      & df3[i, c("Vistis.y")] == "first") {
    df3$number_visit[i] <- "2"
  }
  if (df3[i, c("Vistis.x")] == "initial"
      & df3[i, c("Vistis.y")] == NA) {
    df3$number_visit[i] <- "1"
  }
  if (df3[i, c("Vistis.x")] == NA
      & df3[i, c("Vistis.y")] == "first") {
    df3$number_visit[i] <- "1"
  }
}

I got this error message

Error in if (df3[i, c("Vistis.x")] == "intial" & df3[i, c("Vistis.y")] ==  : 
  missing value where TRUE/FALSE needed

Can someone help me solve this issue? Thank you.

Mr.M
  • You should use `is.na(df3[i,c("Vistis.y")])`. `NA` values in R cannot be compared using the `==` operator – Onyambu May 19 '20 at 01:17
  • Does `(!is.na(df3$Vistis.x)) + (!is.na(df3$Vistis.y))` get you the answer you need? Here `0` means both are `NA`. – Onyambu May 19 '20 at 01:22
  • You can also check out `dplyr::case_when`, and you shouldn't need the for loop. If I am reading correctly what you are trying to do, you could use something like `dplyr` (from `tidyverse`) to make it easier. – steveb May 19 '20 at 01:29
  • Mr. M: Please make a Minimal Reproducible Example. See https://stackoverflow.com/questions/5963269 . You _must_ include some sample data; 4-10 lines is usually enough. Include a sample of what you want the output to look like. We want to help you, but you've got to work with us. – David T May 19 '20 at 01:49
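
The suggestions in the comments can be sketched in base R as follows. This is only an illustration under assumptions: the sample data is hypothetical (the question does not include any), and the sketch treats any non-missing value in `Vistis.x` or `Vistis.y` as one visit rather than checking for the exact strings "initial" and "first".

```r
# Hypothetical sample data; the column names follow the question's code.
df3 <- data.frame(
  Vistis.x = c("initial", "initial", NA, NA),
  Vistis.y = c("first", NA, "first", NA),
  stringsAsFactors = FALSE
)

# Comparing anything with NA using == yields NA rather than TRUE or FALSE,
# which is why if() stops with "missing value where TRUE/FALSE needed".
na_comparison <- NA == NA

# is.na() always returns TRUE or FALSE, so adding the negated results counts
# the non-missing visits per row without a loop.
# The parentheses matter: unary ! binds more loosely than + in R.
df3$number_visit <- (!is.na(df3$Vistis.x)) + (!is.na(df3$Vistis.y))

print(df3)
```

Note that this produces an integer column in which rows with both values missing get 0, whereas the loop in the question stores character values and leaves such rows as the string "NA". If the character form is needed, `as.character()` can be applied to the result.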

0 Answers