30

I need to migrate SQL queries written for MS SQL Server 2005 to Postgres 9.1.
What is the best way to substitute for CROSS APPLY in this query?

SELECT *
FROM V_CitizenVersions         
CROSS APPLY     
       dbo.GetCitizenRecModified(Citizen, LastName, FirstName, MiddleName,
BirthYear, BirthMonth, BirthDay, ..... ) -- lots of params

GetCitizenRecModified() is a table-valued function. I can't post its code because it's really enormous; it performs some complex computations and I can't abandon it.

Erwin Brandstetter
user1178399
  • You don't need cross apply in Postgres. You can use a table function just like a function. Simply join them. – a_horse_with_no_name Jul 13 '12 at 15:36
  • 1
    @a_horse_with_no_name - `CROSS APPLY` re-executes the TVF with correlated parameters rather than executing once and then joining the result. – Martin Smith Jul 13 '12 at 15:37
  • 1
    I realise this is ancient... @MartinSmith that is not necessarily the case on MSSQL if that function is of the inline-table-valued variety, see Paul White's write up on how the MSSQL query planner can sometimes optimize `apply` into a `join` : http://www.sqlservercentral.com/articles/APPLY/69954/ Since we don't see the original code here I am speculating that's what happened based on the comment re performance on Erwin's answer. – Davos Sep 24 '18 at 15:47

4 Answers

41

In Postgres 9.3 or later use a LATERAL join:

SELECT v.col_a, v.col_b, f.*  -- no parentheses, f is a table alias
FROM   v_citizenversions v
LEFT   JOIN LATERAL f_citizen_rec_modified(v.col1, v.col2) f ON true
WHERE  f.col_c = _col_c;

Why LEFT JOIN LATERAL ... ON true?
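A minimal sketch of the difference, using the built-in generate_series() as a stand-in for a table function (the VALUES list and the aliases t, g, n and i are made up for illustration). With CROSS JOIN LATERAL, outer rows for which the function returns nothing disappear, which matches CROSS APPLY; with LEFT JOIN LATERAL ... ON true they are kept and padded with NULL, which matches OUTER APPLY:

```sql
-- generate_series(1, 0) returns no rows, so the row n = 0 is dropped.
SELECT t.n, g.i
FROM  (VALUES (0), (2)) AS t(n)
CROSS  JOIN LATERAL generate_series(1, t.n) AS g(i);
-- rows (in some order): (2,1), (2,2)

-- The LEFT JOIN keeps n = 0 and fills the function columns with NULL.
SELECT t.n, g.i
FROM  (VALUES (0), (2)) AS t(n)
LEFT   JOIN LATERAL generate_series(1, t.n) AS g(i) ON true;
-- rows (in some order): (0,NULL), (2,1), (2,2)
```

Note that a WHERE condition on a column of the function, as in the query above, removes the NULL-padded rows again, so the LEFT JOIN then behaves like an inner join.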


For older versions, there is a very simple way to accomplish what I think you are trying to do with a set-returning function (RETURNS TABLE, RETURNS SETOF record, or RETURNS record):

SELECT *, (f_citizen_rec_modified(col1, col2)).*
FROM   v_citizenversions v

The function computes values once for every row of the outer query. If the function returns multiple rows, resulting rows are multiplied accordingly. All parentheses are syntactically required to decompose a row type. The table function could look something like this:

CREATE OR REPLACE FUNCTION f_citizen_rec_modified(_col1 int, _col2 text)
  RETURNS TABLE(col_c integer, col_d text)
  LANGUAGE sql AS
$func$
SELECT s.col_c, s.col_d
FROM   some_tbl s
WHERE  s.col_a = $1
AND    s.col_b = $2
$func$;
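To illustrate how a set-returning function in the SELECT list multiplies rows, here is a small sketch with the built-in generate_series() in place of the table function (the VALUES list and the names t, n and i are assumptions for the demo only):

```sql
-- The function is evaluated once per outer row; each outer row is
-- repeated once for every row the function returns.
SELECT t.n, generate_series(1, t.n) AS i
FROM  (VALUES (1), (3)) AS t(n);
-- rows (in some order): (1,1), (3,1), (3,2), (3,3)
```

An outer row for which the function returns no rows drops out of the result, which is the same behavior as CROSS APPLY.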

You need to wrap the function call in a subquery or CTE if you want to apply a WHERE clause, because the function's output columns are not visible on the same query level. (It's better for performance anyway, because it prevents repeated evaluation of the function for every output column):

SELECT col_a, col_b, (f_row).*
FROM  (
   SELECT col_a, col_b, f_citizen_rec_modified(col1, col2) AS f_row
   FROM   v_citizenversions v
   ) x
WHERE (f_row).col_c = _col_c;
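The CTE variant mentioned above might look like the following sketch. It reuses the placeholder names from the subquery version (col_a, col_b, col1, col2, and _col_c standing for whatever value you filter on); the CTE name x is arbitrary:

```sql
WITH x AS (
   -- Call the function once per row and keep the result as one composite value.
   SELECT col_a, col_b, f_citizen_rec_modified(col1, col2) AS f_row
   FROM   v_citizenversions
   )
SELECT col_a, col_b, (f_row).*   -- decompose the composite value here
FROM   x
WHERE  (f_row).col_c = _col_c;   -- _col_c is a placeholder, as above
```

In the Postgres versions discussed here a CTE is always materialized, so the function call inside it is not repeated when the outer query decomposes the row.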

There are several other ways to do this or something similar. It all depends on what you want exactly.

Erwin Brandstetter
  • i used the query you proposed. now i'm shocked: the query executes more than a minute. in ms sql it takes less than a second O_O. – user1178399 Jul 13 '12 at 16:26
  • 1
    @user1178399: It's practically impossible to comment on that without knowing the many factors in play. I would speculate that the performance can be improved. – Erwin Brandstetter Jul 13 '12 at 16:36
  • I would suggest that the reason for the performance difference is that the original MSSQL query is probably _not_ executing the function for every row. The function is likely an inline-table-valued-function (ITVF) and the query optimizer has executed it as a `join` rather than a correlated query for every row. In that case using `lateral` is an unfair comparison. In any rdbms, executing a user-defined (in sql) function for every row is a terrible idea. There's a good example of how MSSQL query planner can optimize ITVF here: http://www.sqlservercentral.com/articles/APPLY/69954/ – Davos Sep 24 '18 at 15:44
28

Necromancing:
New in PostgreSQL 9.3:

The LATERAL keyword

left | right | inner JOIN LATERAL

INNER JOIN LATERAL is the same as CROSS APPLY
and LEFT JOIN LATERAL is the same as OUTER APPLY
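Applied to the query from the question, the mapping could be sketched like this. The Postgres function name f_citizen_rec_modified and the parameters v.col1, v.col2 are placeholders (the real function takes many more arguments), and the T-SQL is shown only as comments:

```sql
-- T-SQL:  SELECT * FROM V_CitizenVersions v
--         CROSS APPLY dbo.GetCitizenRecModified(v.Citizen, v.LastName /* ... */) f
SELECT *
FROM   v_citizenversions v
INNER  JOIN LATERAL f_citizen_rec_modified(v.col1, v.col2) f ON true;
-- CROSS JOIN LATERAL f_citizen_rec_modified(...) f  is an equivalent spelling

-- T-SQL:  ... OUTER APPLY dbo.GetCitizenRecModified(...) f
SELECT *
FROM   v_citizenversions v
LEFT   JOIN LATERAL f_citizen_rec_modified(v.col1, v.col2) f ON true;
```

Unlike APPLY, the INNER and LEFT forms need a join condition; ON true serves when there is nothing further to filter on.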

Example usage:

SELECT * FROM T_Contacts 

--LEFT JOIN T_MAP_Contacts_Ref_OrganisationalUnit ON MAP_CTCOU_CT_UID = T_Contacts.CT_UID AND MAP_CTCOU_SoftDeleteStatus = 1 
--WHERE T_MAP_Contacts_Ref_OrganisationalUnit.MAP_CTCOU_UID IS NULL -- 989


LEFT JOIN LATERAL 
(
    SELECT 
         --MAP_CTCOU_UID    
         MAP_CTCOU_CT_UID   
        ,MAP_CTCOU_COU_UID  
        ,MAP_CTCOU_DateFrom 
        ,MAP_CTCOU_DateTo   
   FROM T_MAP_Contacts_Ref_OrganisationalUnit 
   WHERE MAP_CTCOU_SoftDeleteStatus = 1 
   AND MAP_CTCOU_CT_UID = T_Contacts.CT_UID 

    /*  
    AND 
    ( 
        (__in_DateFrom <= T_MAP_Contacts_Ref_OrganisationalUnit.MAP_CTCOU_DateTo) 
        AND 
        (__in_DateTo >= T_MAP_Contacts_Ref_OrganisationalUnit.MAP_CTCOU_DateFrom) 
    ) 
    */
   ORDER BY MAP_CTCOU_DateFrom 
   LIMIT 1 
) AS FirstOE ON true  -- LEFT JOIN needs a join condition
Stefan Steiger
2

I like Erwin Brandstetter's answer; however, I've discovered a performance problem when running:

SELECT *, (f_citizen_rec_modified(col1, col2)).*
FROM   v_citizenversions v

The f_citizen_rec_modified function will be run once for every column it returns (multiplied by every row in v_citizenversions). I did not find documentation for this effect, but was able to deduce it by debugging. Now the question becomes: how can we get the same result (before 9.3, which introduced lateral joins) without this performance-robbing side effect?
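One way to observe the repeated evaluation is a throwaway function that announces every call. This is only a sketch: f_demo and its columns a, b, c are hypothetical and not part of the original query.

```sql
CREATE OR REPLACE FUNCTION f_demo(_n int)
  RETURNS TABLE (a int, b int, c int)
  LANGUAGE plpgsql AS
$func$
BEGIN
   RAISE NOTICE 'f_demo called with %', _n;  -- one notice per execution
   RETURN QUERY SELECT _n, _n * 2, _n * 3;
END
$func$;

-- (f_demo(1)).* is expanded into one function call per output column,
-- so this should raise the notice once for each of a, b and c.
SELECT (f_demo(1)).*;
```

Counting the notices for a single input row should show how often the function actually runs.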

Update: I seem to have found an answer. Rewrite the query as follows:

SELECT x.col1, x.col2, x.col3, (x.func).*
FROM  (
   SELECT v.col1, v.col2, v.col3, f_citizen_rec_modified(v.col1, v.col2) AS func
   FROM   v_citizenversions v
   ) x

The key difference is to get the raw function results first (inner subquery), then wrap that in another SELECT that expands those results into columns. This was tested on PG 9.2.

Joe Love
1

This link appears to show how to do it in Postgres 9.0+:

PostgreSQL: parameterizing a recursive CTE

It's further down the page in the section titled "Emulating CROSS APPLY with set-returning functions". Please be sure to note the list of limitations after the example.

Matthew Wood