5

I'm struggling to find an examples of a PL/R function that could accepts two postgres tables. PL/R docs does not provide any such example.
To have a working examples lets considering using merge of two postgres tables on R side.

Having two tables in postgres

CREATE TABLE x (a numeric, b text);
CREATE TABLE y (a numeric, d text);
INSERT INTO x VALUES (1, 'a'),(2, 'b');
INSERT INTO y VALUES (2, 'b'),(3, 'c');

I'm looking to replace following query

SELECT * FROM x INNER JOIN y ON x.a=y.a;

With the PL/R function defined in R as:

my_function = function(x, y){
    merge(x, y, by = "a")
}

I was able to call PL/R function which accepts single table, but not two.

jangorecki
  • 16,384
  • 4
  • 79
  • 160
  • 1
    This is not impossible in (postgres) SQL, you just need dynamic SQL: generate an SQL query which uses the table- and column-names, and execute this query, returning the resulting rows. ( the `by= "a"` form of parameter passing probably needs to be changed a bit) – wildplasser Mar 23 '16 at 14:22

1 Answers1

2

I don't think that postgeql can accept real table as parameters.

But there is another way to do it. You can pass the table names as parameters.
Here is the function

CREATE OR REPLACE FUNCTION merge(t1 text, t2 text)
returns setof x as 
   BEGIN
     execute 'select * from ' || t1 ||' join ' || t2 || ' on t1.a=t2.a';
   END

The above is the function for postgresql, it can also written in the R function.
Here is the code for R

  1. We have to store all the values of table x into a variable x. See the codes below

    x <- dbGetQuery(con, "SELECT * from sandbox.x") --con is the connection which connect to your db, sandbox is the schema name, x is the table name

  2. Store values of table y into a variable y

    y<-dbGetQuery(con, "SELECT * from sandbox.y")

  3. merge the 2 tables

    total <- merge(x,y,by="a")

  4. You can also write another function to wrapper the merge function, see the codes below

    myTotal <- function(x,y) { result <- merge(x,y,by="a") return(result) }

I attached a screenshot of the steps for your reference

enter image description here

jangorecki
  • 16,384
  • 4
  • 79
  • 160
buqing
  • 925
  • 8
  • 25
  • There do have function which you merge two table in R codes. First, you have to create 2 data frames (datasets), the data frames should match the definition of your tables. Then you can use the build-in function merge, see the codes below total <- merge(data frameA,data frameB,by="a"). Is this what you want? – buqing Mar 23 '16 at 13:23
  • I have update my answer, you can take a look at. let me know if you have any questions – buqing Mar 23 '16 at 14:01
  • @Robinson I came up with the same answer. You beat me to it. There are packages for connecting to postgres such as RPostgreSQL. PL/R seem a little light on support for other OS's. i can't for the life of me see why the name of a table is not enough, since the Sql is executed as text. As a side note "MERGE is not in 9.4.5 (the latest PostgreSQL release as of 2015-10-08) UPSERT functionality will be in the PostgreSQL 9.5 release -- see What's new in PostgreSQL 9.5 " Ref https://wiki.postgresql.org/wiki/SQL_MERGE –  Mar 23 '16 at 15:04
  • @pnorton when I refer table in my posts, I am referring the table object(which contains table definition, including primary key, constrains and data) not the table name. To be more clear, jangorecki want to do the join in the R side not postgresql. It is possible. I have updated the codes in the R side. – buqing Mar 23 '16 at 15:20
  • Oh dear, I'm RPostgreSQL power-user already. I've asked question expecting to get precise answer. @pnorton I've already incorporated UPSERT using RPostgreSQL in R package [pg](https://github.com/jangorecki/pg) using staging table behind the scene. – jangorecki Mar 23 '16 at 17:50
  • @Robinson it doesn't really answer the question – jangorecki Oct 05 '16 at 22:16