6

In dplyr running on R data frames, it is easy to run

df <- df %>% 
    mutate(income_topcoded = ifelse(income > topcode, income, topcode)

I'm now working with a large SQL database, using dplyr to send commands to the SQL server. When I run the same command, I get back

Error in postgresqlExecStatement(conn, statement, ...) : 
RS-DBI driver: (could not Retrieve the result : ERROR:  
function ifelse  (boolean, numeric, numeric) does not exist
HINT:  No function matches the given name and argument types. You may need to add explicit type casts.

How would you suggest implementing ifelse() statements? I'd be fine with something in PivotalR (which seems to support ifelse(), but I don't know how to integrate it with dplyr and couldn't find any examples on SO), some piece of SQL syntax which I can use in-line here, or some feature of dplyr which I was unaware of.

(I have the same problem that I'd like to use grepl() as an in-db operation, but I don't know how to do so.)

crazybilly
  • 2,992
  • 1
  • 16
  • 42
ganong
  • 301
  • 2
  • 13
  • Can you rephrase your question to make it a specific, programming question? As it stands right now it has a good chance of being closed as off-topic (looking for a tool, tutorial...). – Roman Luštrik Mar 29 '15 at 14:20
  • Hi Roman, thanks for the tip! Just rewrote it. If I have questions of the form "looking for a tool" in the future, where would you suggest I post them? – ganong Mar 29 '15 at 17:57
  • I really wouldn't know. Perhaps a mailing list that deals in a similar topic? I usually just do an internet search or ask in the R chat room here on SO. – Roman Luštrik Mar 30 '15 at 13:12

2 Answers2

6

Based on @hadley's reply on this thread, you can use an SQL-style if() statement inside mutate() on dplyr's in-db dataframes:

df <- df %>% 
    mutate( income_topcoded = if (income > topcode) income else topcode)

As far as using grepl() goes...well, you can't. But you can use the SQL like operator:

df  <- df %>%
    filter( topcode %like% "ABC%" )
Community
  • 1
  • 1
crazybilly
  • 2,992
  • 1
  • 16
  • 42
1

I had a similar problem. The best I could do was to use an in-db operation as you suggest:

topcode <- 10000
queryString <- sprintf("UPDATE db.table SET income_topcoded = %s WHERE income_topcoded > %s",topcode,topcode)
dbGetQuery(con, queryString)

In my case, I was using MySQL with dplyr, but it wasn't able to translate my ifelse() into valid SQL.

D. Woods
  • 3,004
  • 3
  • 29
  • 37
  • this is pretty clean and i will use it going forward. thanks! (not giving it a check since I still hope that dplyr's creators will build this functionality in the future) – ganong Oct 28 '15 at 13:03