2

I am just beginning to discover all the advantages of using R & monetDB/monetDBLite package - great stuff!
I was curious however if it's possible to use a custom defined function in dplyr's mutate() i.e.

# Connect to monetDBLite
db <- src_monetdb(embedded = db.dir)
data <- tbl(db, "my.table")

# Try to make a new column using custom defined function  
data %>% group_by(colX) %>% mutate(NewCol = customFunc())  

Results in the following error:

Error in .local(conn, statement, ...) : 
Unable to execute statement 'SELECT colA colB colC...'.
Server says 'ParseException:SQLparser:SELECT: no such operator 'customFunc''.

It seems like the SQL parser has a problem finding the custom function. Am I overseeing something (declare R code in a SQL function definition etc.) ? Is it possible to use custom defined functions with the monetDBLite package like this?

davidski
  • 561
  • 1
  • 4
  • 16
  • How is `customFunc` defined? Generally you need to pass summary functions an argument, i.e. a column to operate on. – alistaire Feb 23 '16 at 18:48
  • 2
    This is not yet supported. Throw in a `collect()` in between perhaps? Transfers should be pretty fast, especially when you `select()` only the columns you need before. – Hannes Mühleisen Feb 24 '16 at 06:28
  • @HannesMühleisen yes, that works pretty good. Any plans on implementing this in the future? Also, does it make a difference that I'm using the `monetDBLite` package - is this possible with a 'real' `monetDB` setup? Thank you and keep up the great work! – davidski Feb 24 '16 at 13:10
  • In addition dplyr can always send arbitrary SQL to the database, as explain in the [dplyr database vignette](https://cran.r-project.org/web/packages/dplyr/vignettes/databases.html): `tbl(my_db, sql("SELECT * FROM flights"))`. This looses all the benefit of dplyr verbs but it's possible. – Paul Rougieux Jan 24 '17 at 10:43
  • yes you are right but i guess this is more aimed at using some DB functions not covered by `dplyr` or using functions defined for some DBs but not for others. There is an example with using `%like%` in the vignette you mention. – davidski Jan 24 '17 at 10:58

1 Answers1

0

Since @Hannes answered my question in the comments - i'm adding it here with some other stuff i picked up:

  • one can use the basic dplyr verbs (select, filter, group_by & mutate) in conjunction with some other "basic" R functions. There is a list of the supported functions, and more on this, in the dplyr databases vignette under the section "SQL translation". So, using the example from my question above - one can do:
    data %>% group_by(col_x) %>% mutate(new_col = log10(some_old_col)).

  • MonetDBLite does not support UDFs (e.g. my question here - if it's possible to use my UDF inside a mutate call) ...

  • ... however the "full" MonetDB does support used defined functions to be sent to the db for computation. It is all well explained on their homepage.

davidski
  • 561
  • 1
  • 4
  • 16