
I'm importing data via an API and do not know beforehand which variables will be exported. How can I check whether a variable exists? If it does, I want its values to remain the same. If it does not, I would like to create it and set it to NA.

This is the code for replication:

 # create dataframe
 df <- data.frame(col1 = 5:8, col2 = 9:12)
 # check if col3 exists in names (it doesn't), if it does keep the same values, if not set to NA
 df$col3 <- ifelse("col3" %in% names(df) == TRUE, df$col3 <- df$col3, NA)
 # same as above, but the variable col2 does exist
 df$col2 <- ifelse("col2" %in% names(df) == TRUE, df$col2 <- df$col2, NA)

Setting the values of the variable to NA when it does not exist works well. However, when the variable exists, I get a column of length 4 with the first value ("9") repeated and I want a column with 9:12.

  • Does this answer your question? [Adding column if it does not exist](https://stackoverflow.com/questions/45857787/adding-column-if-it-does-not-exist) – Johan Oct 07 '20 at 21:41

1 Answer


Here is how I would do it in base R:

df <- data.frame(col1 = 5:8, col2 = 9:12)
check_names <- c("col1", "col2", "col3")

for (name in check_names) {
  if (!name %in% colnames(df)) {
    df[, name] <- NA
  }
}

df
  col1 col2 col3
1    5    9   NA
2    6   10   NA
3    7   11   NA
4    8   12   NA
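
The same idea can be sketched without a loop. This variant assumes the same `df` and `check_names` as above; `missing_names` is just an illustrative name. `setdiff()` finds the expected names that are absent from the data frame, and one assignment creates them all.

```r
df <- data.frame(col1 = 5:8, col2 = 9:12)
check_names <- c("col1", "col2", "col3")

# names that are expected but not present in df
missing_names <- setdiff(check_names, colnames(df))

# create the missing columns in one step; existing columns are untouched
df[missing_names] <- NA
```

Whether the loop or the `setdiff()` form reads better is mostly a matter of taste; both leave `col1` and `col2` unchanged.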

You see the 9 repeated because ifelse returns "A vector of the same length and attributes (including dimensions and "class") as test". Your test condition, "col2" %in% names(df), has length 1, so ifelse returns only the first element of df$col2, which is 9. That single value is then recycled over the whole column, because you are effectively assigning df$col2 <- 9.
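
To make the difference concrete, here is a small sketch comparing `ifelse()` with a plain `if`/`else` on the data from the question. The names `res` and `col2` are only for illustration.

```r
df <- data.frame(col1 = 5:8, col2 = 9:12)

# the test has length 1, so ifelse() returns a length-1 result
res <- ifelse("col2" %in% names(df), df$col2, NA)
length(res)   # 1

# if/else evaluates the single condition once and returns the whole vector
col2 <- if ("col2" %in% names(df)) df$col2 else NA
length(col2)  # 4

# the same pattern creates a missing column filled with NA
df$col3 <- if ("col3" %in% names(df)) df$col3 else NA
```

So if you prefer to keep a one-line check per column, `if`/`else` is the construct to use here, since the condition is a single TRUE or FALSE and not a vector.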

starja