-1

I have two tables as below

Table A

Key    Field1  Created_date  
1234    a      Jan-18  
1234    b      Feb-18  
1234    c      Mar-18  
4567    d      Mar-18  
6789    e      Apr-18  

Table B

Key Posted_date
1234    Mar-18
4567    Apr-18

I want to join table A with B on the field Key and select records with max(A.created_date) where A.created_date < B.posted_date. The table B has around 300 million records. Could you please help me out.

Result

Key Field1  Created_date  
1234    b   Feb-18  
4567    d   Mar-18 
SreeVik
  • 43
  • 8

1 Answers1

0

Various different ways to solve this. Here's a solution which breaks it down into two clear steps so you can understand the principles.

The subquery uses analytic ranking to score the created_date for each key record in table A that matches the criteria in table B. The outer filter selects the records from table A for each maximum created_date.

select a.key
       , a.field1
       , a.created_date
from (
  select a.*
         , rank() over (partition by a.key
                        order by a.created_date desc) as rnk
  from b
       join a on a.key = b.key
  where a.created_date < b.posted_date
  ) a
where a.rnk = 1
/

Presumably if table B has 300 million records then table A has more. You will have to experiment to get a version that's sufficiently performative for your needs. For instance, with tables that big we would hope to have licensed the Partitioning option; that might affect how you approach the join and filtering.

APC
  • 144,005
  • 19
  • 170
  • 281