
I have a data frame with column names 1990.x, ..., 2000.x and 1990.y, ..., 2000.y. I want to replace the NAs in the variables ending in ".x" with values computed from the ".y" variable for the corresponding year. It is an element-by-element computation using the formula 1990.x = 0.5 + 0.2*log(1990.y).

I wanted to do something like this:

for (v in colnames(df[ ,grepl(".x",names(df))])) {
  print(v)
  df$v <- ifelse(is.na(df$v), ols$coefficients[1]+ols$coefficients[2]*log(df$gsub(".x",".y",v)), df$v)
}

But this does not work. How can I make this loop work, or is there a better solution? Thanks.

nov
  • It will help to provide some or all of your data frame, using _e.g._ `dput(df)`. I also wonder if the NAs arise due to a merge or join and could be avoided. – neilfws Jan 30 '19 at 23:34
  • You can avoid the loop and do `x_cols <- grep("x$", names(dat)); y_cols <- grep("y$", names(dat)); dat[x_cols][is.na(dat[x_cols])] <- 0.5 + 0.2*log(dat[y_cols][is.na(dat[x_cols])]); dat` – markus Jan 30 '19 at 23:37

1 Answer


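The $ operator is a convenience for interactive use, but it does not evaluate its right-hand side: df$v looks for a column literally named "v", not the column whose name is stored in the variable v. For the same reason, df$gsub(...) is not valid. So $ cannot be used in a for loop where the column name changes on each iteration. Use the [ operator (square brackets) instead, both for the .x column and for the matching .y column: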

df[, v] <- ifelse(is.na(df[, v]), ols$coefficients[1] + ols$coefficients[2] * log(df[, sub(".x", ".y", v, fixed = TRUE)]), df[, v])
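
For reference, here is a minimal self-contained sketch of the whole loop. The toy data frame and the names b0 and b1 are assumptions made up for illustration; b0 and b1 stand in for ols$coefficients[1] and ols$coefficients[2], using the 0.5 and 0.2 from the question. It uses [[ to pick columns by name and an anchored pattern so that only names ending in a literal ".x" are matched.

```r
# Toy data; column names follow the question, values are invented.
df <- data.frame(
  `1990.x` = c(1.0, NA, 3.0),
  `1991.x` = c(NA, 2.0, NA),
  `1990.y` = c(10, 20, 30),
  `1991.y` = c(5, 15, 25),
  check.names = FALSE  # keep names like "1990.x" unchanged
)

# Hypothetical stand-ins for ols$coefficients[1] and ols$coefficients[2].
b0 <- 0.5
b1 <- 0.2

# "\\.x$" matches a literal ".x" at the end of the name only.
x_cols <- grep("\\.x$", names(df), value = TRUE)

for (v in x_cols) {
  y <- sub("\\.x$", ".y", v)   # name of the matching .y column
  miss <- is.na(df[[v]])       # rows that need filling
  df[[v]][miss] <- b0 + b1 * log(df[[y]][miss])
}

print(df)
```

Assigning only to the missing rows leaves the existing values untouched, so ifelse() is not needed here.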
Andrew Haynes