I have the following block of code that needs to be repeated often:

library(data.table)  # fread() and :=
library(magrittr)    # %>%

flights <- fread("https://raw.githubusercontent.com/wiki/arunsrinivasan/flights/NYCflights14/flights14.csv")

flights$origin %>% table() 
flights[grepl("jfk", origin, ignore.case = TRUE),
        origin := "0",
      ][grepl("ewr|lga", origin, ignore.case = TRUE),
        origin := "1",
      ][, origin := as.numeric(origin)] 
flights$origin %>% table()

Here is my attempt at wrapping this in a function that allows me to pass any number of regular expressions, each with its own replacement, for any given column in the data set.

my_function <- function(regex, replacement, column) {   
    flights[, column, with = FALSE] %>% table()   
    for (i in seq_along(regex)) {
        flights[grepl(regex[i], column, ignore.case = TRUE), 
                column := replacement[i],
                with = FALSE]   
    }   
    flights[, column := as.numeric(column)]
    flights[, column, with = FALSE] %>% table() 
}

But this produces the following warnings:

Warning messages:
1: In `[.data.table`(flights, grepl(regex[i], column, ignore.case = TRUE),  :
  with=FALSE together with := was deprecated in v1.9.4 released Oct 2014. Please wrap the LHS of := with parentheses; e.g., DT[,(myVar):=sum(b),by=a] to assign to column name(s) held in variable myVar. See ?':=' for other examples. As warned in 2014, this is now a warning.
2: In eval(jsub, SDenv, parent.frame()) : NAs introduced by coercion

Any help would be appreciated. Many thanks.

fahmy
    [Please make your example reproducible](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example/5963610). This will make it easier for others to help you. – Jaap Aug 12 '18 at 11:57

1 Answer

Figured it out, will leave my solution here for everyone else's benefit.

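  1. Instead of using with = FALSE, wrap the variable in () on the left-hand side of :=, i.e. (column) := ..., so the assignment goes to the column whose name the variable holds.
  2. To pass the column's values to another function (grepl() in my case), use get(column). Without it the function receives the column name itself as a string, which is also why as.numeric(column) produced the "NAs introduced by coercion" warning.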

my_function <- function(regex,              # Vector of regex strings to match
                        replacement,        # Vector of replacement strings, one per entry in 'regex'
                        column,             # Name of the column to operate on, as a string
                        as.numeric = FALSE  # Optional; convert 'column' to numeric as the final step?
) {
  cat("Converting:..")
  flights[, column, with = FALSE] %>% table() %>% print()
  for (i in seq_along(regex)) {
    # get(column) fetches the column's values; (column) assigns to it by name
    flights[grepl(regex[i], get(column), ignore.case = TRUE, perl = TRUE),
            (column) := replacement[i]]
  }
  if (as.numeric) {
    flights[, (column) := as.numeric(get(column))]
  }
  cat("to:..")
  flights[, column, with = FALSE] %>% table() %>% print()
}
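
The function above is tied to one particular table, so here is a minimal sketch of the same idea with the table passed in as an argument. The names recode_column, dt, to_numeric and toy are made up for this illustration, and the small toy table only stands in for the real data so the example runs without a download. It uses dt[[column]] to read the column, which should behave like get(column) here but cannot accidentally pick up a variable of the same name from the calling environment.

```r
library(data.table)

# Hypothetical helper: recode matches in one column of 'dt', by reference.
recode_column <- function(dt, regex, replacement, column, to_numeric = FALSE) {
  stopifnot(is.data.table(dt), length(regex) == length(replacement))
  for (i in seq_along(regex)) {
    # Read the column by name, then assign to it by name with (column) :=
    rows <- grepl(regex[i], dt[[column]], ignore.case = TRUE, perl = TRUE)
    dt[rows, (column) := replacement[i]]
  }
  if (to_numeric) {
    dt[, (column) := as.numeric(get(column))]
  }
  invisible(dt)  # dt is modified in place; return it invisibly for chaining
}

# Toy data standing in for the flights table (assumption, not the real data)
toy <- data.table(origin = c("JFK", "EWR", "LGA", "JFK"))
recode_column(toy, c("jfk", "ewr|lga"), c("0", "1"), "origin", to_numeric = TRUE)
toy$origin
```

Because := updates by reference, toy itself is changed by the call and there is no need to assign the result back.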
fahmy