I am trying to update an old data frame with data from a new data frame.

I found the approach below. It works for some of the fields, but not all, and I'm not sure how to change it, as it is beyond my skill set. I tried removing the is.na(x) part of the ifelse() call, but that did not work.

df_old <- data.frame(
      bb = as.character(c("A", "A", "A", "B", "B", "B")),
      y = as.character(c("i", "ii", "ii", "i", "iii", "i")),
      z = 1:6,
      aa = c(NA, NA, 123, NA, NA, 12))

df_new <- data.frame(
      bb = as.character(c("A", "A", "A", "B", "A", "A")),
      z = 1:6,
      aa = c(NA, NA, 123, 1234, NA, 12))

cols <- names(df_new)[names(df_new) != "z"]

df_old[,cols] <- mapply(function(x, y) ifelse(is.na(x), y[df_new$z == df_old$z], x), df_old[,cols], df_new[,cols])

The code also changes my bb variable from a character vector to a numeric one. Do I need another call to mapply() that handles bb specifically?

mtotof

1 Answer

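To update the aa and bb columns, you can join the two data frames with merge(). This assumes that column z is the key that identifies matching rows in both data frames. By default, merge() adds the suffixes .x (from df_old) and .y (from df_new) to columns that appear in both inputs.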

# join on `z` column
df_final <- merge(df_old, df_new, by = "z")
# replace NAs with new values for column `aa` from `df_new`
df_final$aa <- ifelse(is.na(df_final$aa.x), df_final$aa.y, df_final$aa.x)
# choose new values for column `bb` from `df_new`
df_final$bb <- df_final$bb.y
# keep only the updated columns, in the desired order
df_final <- df_final[, c("bb", "z", "y", "aa")]

df_final
  bb z   y   aa
1  A 1   i   NA
2  A 2  ii   NA
3  A 3  ii  123
4  B 4   i 1234
5  A 5 iii   NA
6  A 6   i   12
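
If you would rather update df_old in place without a join, a rough alternative is to look up each row's position in df_new with match() and assign column by column. This is only a sketch under a few assumptions: z is unique in df_new, the names idx, matched, fill and df_updated are made up for illustration, and stringsAsFactors = FALSE is set so that bb stays a character vector on R versions before 4.0. Assigning one column at a time also avoids two likely causes of the type change in the question: ifelse() returns integer codes when given factors, and mapply() simplifies its result to a matrix, which can hold only one type.

```r
# Sample data from the question; stringsAsFactors = FALSE keeps `bb` and `y`
# as character on R versions before 4.0 as well.
df_old <- data.frame(
  bb = c("A", "A", "A", "B", "B", "B"),
  y  = c("i", "ii", "ii", "i", "iii", "i"),
  z  = 1:6,
  aa = c(NA, NA, 123, NA, NA, 12),
  stringsAsFactors = FALSE
)
df_new <- data.frame(
  bb = c("A", "A", "A", "B", "A", "A"),
  z  = 1:6,
  aa = c(NA, NA, 123, 1234, NA, 12),
  stringsAsFactors = FALSE
)

# Position in df_new of each df_old row, matched on the key `z`
# (NA where df_new has no row with that `z`)
idx <- match(df_old$z, df_new$z)
matched <- !is.na(idx)

df_updated <- df_old

# `bb`: take the new value wherever a matching row exists
df_updated$bb[matched] <- df_new$bb[idx[matched]]

# `aa`: fill only the values that are missing in df_old
fill <- matched & is.na(df_old$aa)
df_updated$aa[fill] <- df_new$aa[idx[fill]]

# Inspect the column types: `bb` should still be character, `aa` numeric
str(df_updated)
```

Unlike the merge() version, this keeps the original column order of df_old and leaves any rows without a match in df_new unchanged.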
EJJ