3

I'm trying to teach myself and better understand alternative methods to a UNION and discover when I can use joins.

As I'm playing with this I can't seem to get what I want without a UNION. Is it possible to write this in a single query?

SELECT DISTINCT a.id
  FROM table1 a, table2 b
 WHERE a.id = b.id
       AND a.ind IS NULL
       AND b.year >= '2017'
       AND b.code IN ('01','02','03')
       AND b.flag NOT IN ('F','L')
UNION
SELECT DISTINCT a.id
  FROM table1 a, table3 c
 WHERE a.id = c.id
       AND a.ind IS NULL
       AND c.area = 'MAIN'
       AND SYSDATE >= c.start

Thanks in advance for any guidance or help.

Zilch
  • 93
  • 1
  • 1
  • 5
  • 1
    What you have now _is_ a single query, though it uses a union. I think an alternative would involve joining all three tables together, but then the `WHERE` logic could get truly bufugly (if you don't know what that means, it means beyond ugly). – Tim Biegeleisen Mar 28 '18 at 13:32
  • No need for `DISTINCT` in your queries as `UNION` will remove any duplicates. – Magnus Mar 28 '18 at 13:33
  • You could have provided an SQL Fiddle with some test data. – Gerard H. Pille Mar 28 '18 at 13:45

5 Answers5

8
SELECT DISTINCT a.id
FROM table1 a
LEFT JOIN table2 b on b.id = a.id AND b.year >= '2017'
     AND b.code IN ('01', '02', '03') AND b.flag NOT IN ('F', 'L') 
LEFT JOIN table3 c ON a.id = c.id and c.area = 'MAIN' and SYSDATE >= c.start
WHERE a.ind IS NULL
    AND ( b.id IS NOT NULL or c.id IS NOT NULL)

This is one of those things where the old obsolete A,B join syntax really shows it's age, when you have some conditions that must go in a specific ON clause and others that must go in the WHERE clause. It would be very difficult to write this query that way, and even harder to read and understand it later. Better to always write out the full INNER JOIN, LEFT JOIN, etc.

Joel Coehoorn
  • 399,467
  • 113
  • 570
  • 794
  • Two outer joins followed by a couple of "is not null" to turn them into inner joins again. Why not? – Gerard H. Pille Mar 28 '18 at 13:55
  • 1
    @GerardH.Pille It's called an exclusion join. It's a common pattern that works quite well, and definitely does **not** just turn things back to an inner join again. The normal pattern would just check `IS NULL` instead of `IS NOT NULL`, but in this case doing the pattern twice with an `OR` the results are still correct. – Joel Coehoorn Mar 28 '18 at 13:56
  • An exclusion join! Which proves one is never too old to learn. I still think you're wasting time. – Gerard H. Pille Mar 28 '18 at 14:14
5

Whenever you see a distinct, resources have been wasted.

SELECT a.id
  FROM table1 a
  where a.ind IS NULL
    and (   exists (select null from table2 b
                      WHERE a.id = b.id
                        AND b.year >= '2017'
                        AND b.code IN ('01','02','03')
                        AND b.flag NOT IN ('F','L') )
         or exists (SELECT null FROM table3 c
                      WHERE a.id = c.id
                        AND c.area = 'MAIN'
                        AND SYSDATE >= c.startdt)
        )

Indexes on table2 (id,year,code,flag) and table3 (id,area,startdt) won't hurt performance. Oracle 11gR2 did not allow me to have a column named "start".

Gerard H. Pille
  • 2,528
  • 1
  • 13
  • 17
  • I tend to use IN instead of EXISTS as it is shorter and usually easier to read. You must be careful and (almost) always use NOT EXISTS though. – gpeche Mar 28 '18 at 13:47
  • @gpeche "(almost) always use NOT EXISTS though" ? – Magnus Mar 28 '18 at 13:50
  • @gpeche you seem new to Oracle? The exists had the advantiage to stop as soon as one row was found, while the in will create a view of all rows that satisfy the conditions, sort and distinct them. Of course, with the cost based floptimizer, you're never sure. – Gerard H. Pille Mar 28 '18 at 14:03
  • 1
    @GerardH.Pille - you seem to be too old to Oracle. What you describe changed a long time ago; at least since version 10, `IN` and `EXISTS` result in the same execution plan. –  Mar 28 '18 at 14:14
  • @Gerard H. Pille using Oracle since about 2001, it's been a looooong time since I have seen any difference in execution plans for IN vs EXISTS. Also, read http://www.oracle.com/technetwork/issue-archive/2006/06-may/o36asktom-099107.html: `You would instead consider using either WHERE EXISTS or WHERE IN. (The cost-based optimizer [CBO] sees them as more or less equivalent—you might, too.)` and `In both cases, the optimizer would employ either a semi join, which you cannot specify but the optimizer may perform (it stops joining after the first hit)` – gpeche Mar 28 '18 at 14:16
  • 1
    @Magnus read https://stackoverflow.com/questions/173041/not-in-vs-not-exists . Basically `IN` and `EXISTS` are "equivalent", but `NOT IN` and `NOT EXISTS` behave differently in presence of `NULL`. In most cases, the "intuitive" behaviour is almost always the one provided by `NOT EXISTS`. – gpeche Mar 28 '18 at 14:21
  • @gpeche I'm well aware. Different behavior for use in different situations. – Magnus Mar 28 '18 at 14:58
2

I would use this

select a.id
from table1 a
where
  a.ind is null and (
  a.id in (
    select b.id
    from table2 b
    where
    b.year >= '2017'
    and b.code IN ('01','02','03')
    and b.flag NOT IN ('F','L')  
  ) or
  a.id in (
    select c.id
    from table3 c
    where
    c.area = 'MAIN'
    and sysdate >= c.start
  )
)

DISTINCT + table joins can be rewritten as IN/EXISTS most of the times. In several databases it might even select a better execution plan (e.g: Oracle)

gpeche
  • 21,974
  • 5
  • 38
  • 51
0

Please let me know if using Full outer join makes sense

with cte as (
select 1 as nu union 
select 2 as nu union 
select 3 as nu union 
select 4 as nu union 
select 5 as nu union 
select 6 as nu )
,cte2 as (
select 1 as nu union 
select 2 as nu union 
select 3 as nu )
,cte3 as (
select 7 as nu union 
select 8 as nu union 
select 9 as nu )
select coalesce(cte.nu,cte2.nu,cte3.nu)
from cte
full outer join cte2 on cte.nu = cte2.nu
full outer join cte3 on cte.nu = cte3.nu
SSMan
  • 1
-2

Please check if this works

SELECT DISTINCT a.id
  FROM table1 a, table2 b, table3 c
 WHERE (a.id = b.id
        OR a.id = c.id)
       AND a.ind IS NULL
       AND (( b.year >= '2017'
       AND b.code IN ('01','02','03')
       AND b.flag NOT IN ('F','L'))
       OR (c.area = 'MAIN'
       AND SYSDATE >= c.start))
Avani
  • 205
  • 1
  • 6