I have a small problem, or actually a fairly large one. I have a dataset with 3 variables that I am currently using; let's call them var1, var2 and var3. In total I have over 3000 observations, with NA values in every variable.

var1=age_1, var2=Yes/No and var3=age_2

What I want to do is this: if var2="Yes", the value from var1 should be copied into var3. I have done it this way:

var3[var2=="Yes"]<-var1

but I get the error message:

Error in var3[var2 == "Yes"] <- var1 :
  NAs are not allowed in subscripted assignments

Does anyone have a quick solution for this?

Akire

2 Answers

You can try

var3 <- ifelse(var2 == "Yes", var1, var3)
Luca Braglia

The error suggests that you have NA values in var2. You can check with sum(is.na(var2)) > 0. When the condition is missing, R won't guess whether it should count as "Yes" or "No", so you get the error.
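A minimal sketch of that check and of the failure itself. The vectors are made-up sample values, not the asker's data, and tryCatch is used only so the script keeps running after the error:

```r
# Hypothetical sample data with one missing value in var2
var1 <- c(10, 20, 30, 40)
var2 <- c("No", "Yes", NA, "Yes")
var3 <- c(100, 200, 300, 400)

# Count the missing values in the condition variable
n_missing <- sum(is.na(var2))
n_missing
# [1] 1

# The comparison carries the NA over into the logical index
cond <- var2 == "Yes"
cond
# [1] FALSE  TRUE    NA  TRUE

# Assigning several values through an index that contains NA fails;
# capture the error message instead of stopping
result <- tryCatch(
  {
    var3[cond] <- var1
    "no error"
  },
  error = function(e) conditionMessage(e)
)
result
# [1] "NAs are not allowed in subscripted assignments"
```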

Also, by indexing only one side of the assignment, you're not necessarily matching up values across rows. So even if you fix your NA values, you'd likely get the "number of items to replace is not a multiple of replacement length" message, and values could end up in the wrong rows.
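To see the misalignment concretely, here is a small sketch with made-up values and no NAs. The names bad and good are only for illustration, and suppressWarnings hides the length warning so the result can be inspected:

```r
# Hypothetical sample data without missing values
var1 <- c(10, 20, 30, 40)
var2 <- c("No", "Yes", "No", "Yes")
var3 <- c(100, 200, 300, 400)

# Only the left side is indexed: the two selected slots are filled
# from the START of var1 (10 and 20), not from the matching rows
bad <- var3
suppressWarnings(bad[var2 == "Yes"] <- var1)
bad
# [1] 100  10 300  20

# Indexing both sides with the same condition keeps rows aligned
good <- var3
good[var2 == "Yes"] <- var1[var2 == "Yes"]
good
# [1] 100  20 300  40
```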

One trick is to use which to remove the NA values from the logical index and turn it into numeric indices. Then, once you know the rows you want to replace, you can use the same index on both sides of the assignment.

var1 <- c(10, 20, 30)
var2 <- c("No", "Yes", "No")
var3 <- c(100, 200, 3000)

idx <- which(var2=="Yes")
var3[idx] <- var1[idx]

var3
# [1]  100   20 3000

Or you can use the ifelse function, which makes all these steps much easier:

var3<-ifelse(var2=="Yes", var1, var3)
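One thing to keep in mind with the ifelse version: where the test is NA, the result is NA too, so an existing var3 value in that row would be overwritten. A small sketch with made-up values; treating a missing var2 as "not Yes" is an assumption about what is wanted here, and keep_na is just an illustrative name:

```r
# Hypothetical sample data with a missing value in var2
var1 <- c(10, 20, 30)
var2 <- c("No", "Yes", NA)
var3 <- c(100, 200, 300)

# ifelse() returns NA wherever the test is NA, so the old var3 value is lost
plain <- ifelse(var2 == "Yes", var1, var3)
plain
# [1] 100  20  NA

# Treating a missing var2 as "not Yes" keeps the existing var3 value
keep_na <- ifelse(!is.na(var2) & var2 == "Yes", var1, var3)
keep_na
# [1] 100  20 300
```

The which approach above behaves like the second call, since which drops the NA positions.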
MrFlick
  • Why the heck doesn't this answer (OP's command, in effect) work? https://stackoverflow.com/questions/5824173/replace-a-value-in-a-data-frame-based-on-a-conditional-if-statement It seems brackets can only be used to replace with a fixed value, not with a variable. – luchonacho Sep 22 '21 at 21:17
  • @luchonacho I'm not sure which answer you are referring to specifically. If you meant this one, I've added some sample data that you can use to run and verify it works. – MrFlick Sep 22 '21 at 21:21
  • If you are talking about the `if(i %in% "B") junk$nm <- "b"` part, that's because as soon as that `if` condition is true, `junk$nm <- "b"` runs, which assigns "b" to every value in that column. That statement has no idea what the "current" row is. It doesn't use `i`. – MrFlick Sep 22 '21 at 21:23
  • Sorry, I meant the one on the link I shared. Essentially, `var3[var2=="Yes"]<-var1`. I get a message that the replacement has a different number of rows than the data. – luchonacho Sep 22 '21 at 21:44
  • @luchonacho That's because you're not indexing `var1` at all. If all vectors are the same length, it should look like `var3[var2=="Yes"]<-var1[var2=="Yes"]` – MrFlick Sep 22 '21 at 21:47
  • got it. Thanks! – luchonacho Sep 27 '21 at 23:55