I've written an R function that checks whether several columns exist, and I'm looking for a smarter way to do this.

setBD04 <- function(.data) {
  .data %>% 
    mutate(CIS_C = if("CIS_C" %in% colnames(.)) as.character(CIS_C) else "",
           NSO_C = if("NSO_C" %in% colnames(.)) as.character(NSO_C) else "",
           TID_C = if("TID_C" %in% colnames(.)) as.character(TID_C) else "",
           NID_C = if("NID_C" %in% colnames(.)) as.character(NID_C) else "",
           CCR_C = if("CCR_C" %in% colnames(.)) as.character(CCR_C) else "",
           TCR_C = if("TCR_C" %in% colnames(.)) as.character(TCR_C) else "",
           MON_C = if("MON_C" %in% colnames(.)) as.character(MON_C) else "",
           MORG_C = if("MORG_C" %in% colnames(.)) as.character(MORG_C) else "",
           NCPR_C = if("NCPR_C" %in% colnames(.)) as.character(NCPR_C) else "",
           FOT_C = if("FOT_C" %in% colnames(.)) as.character(FOT_C) else "",
           FCAN_C = if("FCAN_C" %in% colnames(.)) as.character(FCAN_C) else "",
           NCPA_C = if("NCPA_C" %in% colnames(.)) as.character(NCPA_C) else "",
           MCK_C = if("MCK_C" %in% colnames(.)) as.character(MCK_C) else "",
           MCI_C = if("MCI_C" %in% colnames(.)) as.character(MCI_C) else "") %>% 
    select(CIS_C, NSO_C, TID_C, NID_C, CCR_C, TCR_C, MON_C, MORG_C, NCPR_C,
           FOT_C, FCAN_C, NCPA_C, MCK_C, MCI_C) %>% 
    return()
}

As you can see, the mutate() call is very repetitive. Is there a functional programming approach that would do this more cleanly?

Specifically, this pattern is repeated 14 times, once for each column that I want to convert to character if it exists or create as blank if it doesn't:

CIS_C = if("CIS_C" %in% colnames(.)) as.character(CIS_C) else ""
TarJae
Diego Pacheco
  • It's easier to help you if you include a simple [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) with sample input and desired output that can be used to test and verify possible solutions. – MrFlick Feb 21 '21 at 05:15

2 Answers

Define all the columns that you want in the final output (cols), then use setdiff to find the ones that are missing from .data and create them as blank columns.

setBD04 <- function(.data) {

  cols <- c('CIS_C', 'NSO_C', 'TID_C', 'NID_C', 'CCR_C', 'TCR_C', 'MON_C',
            'MORG_C', 'NCPR_C', 'FOT_C', 'FCAN_C', 'NCPA_C', 'MCK_C', 'MCI_C')
  .data[setdiff(cols, names(.data))] <- ''
  return(.data[cols])
}
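Note that the function above leaves existing columns in their original type, whereas the code in the question also applies as.character to them. A minimal sketch of one way to add that step, assuming base R only; the names `set_bd04_chr` and `df` are hypothetical and used purely for illustration:

```r
# Sketch: same setdiff idea, plus as.character() on every kept column.
# `set_bd04_chr` and `df` are hypothetical names, not from the original post.
set_bd04_chr <- function(.data, cols) {
  # Create the columns that are missing from .data as empty strings
  .data[setdiff(cols, names(.data))] <- ""
  # Keep only the requested columns, in the requested order
  out <- .data[cols]
  # Convert every kept column to character, preserving the data frame
  out[] <- lapply(out, as.character)
  out
}

# Usage with a small made-up data frame that lacks NSO_C
df <- data.frame(CIS_C = 1:2, TID_C = c(10.5, 20))
res <- set_bd04_chr(df, c("CIS_C", "NSO_C", "TID_C"))
str(res)
```

Passing `cols` as an argument, rather than hard-coding it, is just a design choice for the sketch; the 14 column names from the question could be supplied the same way.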
Ronak Shah

I couldn't come up with a functional programming solution (maybe I need to spend more time on it), but you could use a for loop for this. It converts each existing column to character and, for each column that doesn't exist, prints a message instead of creating it.

my_function <- function(data, my_vars) {
  for(x in my_vars) {
    if(x %in% names(data)) {
      data[[x]] <- as.character(data[[x]])
    } else {
      message(paste("Column", x, "does not exist"))
    }
  }
  data
}

If I come up with a functional programming solution, I'll let you know.
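For reference, one possible functional-style variant of the loop might look like the sketch below. It is an assumption about what such a solution could be, not a tested answer to the question: a small helper handles one column name, and lapply applies it to every name. Unlike the loop above, it creates missing columns as empty strings and prints no message. The names `as_chr_or_blank`, `set_cols_functional`, and `df` are hypothetical.

```r
# Sketch: one helper handles a single column name; lapply() maps it over all names.
# All function and variable names here are hypothetical.
as_chr_or_blank <- function(col, data) {
  if (col %in% names(data)) {
    # Column exists: convert it to character
    as.character(data[[col]])
  } else {
    # Column is missing: fill with empty strings, one per row
    rep("", nrow(data))
  }
}

set_cols_functional <- function(data, cols) {
  out <- lapply(cols, as_chr_or_blank, data = data)
  names(out) <- cols
  # Assemble the list of character vectors into a data frame
  as.data.frame(out, stringsAsFactors = FALSE)
}

# Usage with a small made-up data frame that lacks NSO_C
df <- data.frame(CIS_C = 1:3, MON_C = c("a", "b", "c"))
res <- set_cols_functional(df, c("CIS_C", "NSO_C", "MON_C"))
str(res)
```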

Anoushiravan R