1

I have table and sql (running on Oracle):

T (it's just an example, the table is huge)

a  b  c
-------
1  4  7
2  5  5
3  6  8

sql:

SELECT a, b, c
FROM t

union all

SELECT 'R',b,c
FROM t
WHERE b = c AND (condition to another tables, etc)

It returns:

1  4  7
2  5  5
3  6  8
R  5  5

Is it possible to avoid UNION here (and don't add JOIN)? In other words - is it possible to optimize the query to avoid Oracle look into table T twice?

Artsiom
  • 99
  • 3
  • 11
  • what means "avoid UNION" ? What is the desired output? – Florin Ghita Apr 01 '13 at 11:35
  • Aim is to reduce DB load. Actually, the table is huge and I want to avoid scan it twice. Join isn't an option here. – Artsiom Apr 01 '13 at 11:40
  • Desired output - the same (in question after words 'It returns') – Artsiom Apr 01 '13 at 11:42
  • What does `condition to another tables, etc` mean? – Tim Apr 01 '13 at 11:54
  • 1
    If your query is (pseudo-code) "Select name, condition from Fruit UNION select name, condition from fruit where condition = 'unripe'" your resultset is the entire set plus a subset. There is no way to get both parts of the resultset without scanning the entire set a second time to retrieve the subset from it. The full table will either be on disk or in a table-expression within your query, but either way, those tuples will have to be scanned again to apply the limiting condition (and then set your 'R' flag) for the subset. – Tim Apr 01 '13 at 11:58

1 Answers1

1

This will read your table once. The join is done with an auxiliary table which contains only two values(of course this in in memory - no I/O)

with t as(
  select '1' a,  '4' b,  '7' c from dual union all
  select '2',  '5',  '5' from dual union all
  select '3',  '6',  '8' from dual
)
select decode(aux.col,1,t.a,'R'), t.b, t.c 
from t
join (select '1' col from dual union all select '2' from dual) aux
on (aux.col='1' or t.b=t.c);

The query does not depend on '1' and '2'. It can be:

select decode(aux.col, 'bla', t.a,'R'), t.b, t.c 
from t
join (select 'bla' col from dual union all select 'otherbla' from dual) aux
on (aux.col='bla' or t.b=t.c);

UPDATE: Also, if number of b=c records is small, you can speed up your actual query creating an index:

 create index fbi on t (b-c);

and then replace in your query WHERE b = c with WHERE b - c = 0

UPDATE2 Just to get the ideea on how these queries are executed:

create table t(a varchar2(10), b varchar2(10), c varchar2(10));

insert into t 
select mod(dbms_random.random(),1000),
  mod(dbms_random.random(),1000),
  mod(dbms_random.random(),1000)
from dual
connect by level < 1000000;

exec DBMS_STATS.GATHER_TABLE_STATS('DEV','T');

--1
SELECT a, b, c
FROM t;
---------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)|
---------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |   999K|    11M|   700   (3)|
|   1 |  TABLE ACCESS FULL| T    |   999K|    11M|   700   (3)|
---------------------------------------------------------------

--2
SELECT a, b, c
FROM t
union all
SELECT 'R',b,c
FROM t
WHERE b = c;
----------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)|
----------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |  1009K|    11M|  1426  (53)|
|   1 |  UNION-ALL         |      |       |       |            |
|   2 |   TABLE ACCESS FULL| T    |   999K|    11M|   700   (3)|
|   3 |   TABLE ACCESS FULL| T    | 10000 |    97K|   726   (7)|
---------------------------------------------------------------- 


--3
select decode(aux.col, 'bla', t.a,'R'), t.b, t.c 
from t
join (select 'bla' col from dual union all select 'otherbla' from dual) aux
on (aux.col='bla' or t.b=t.c);


----------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)|
----------------------------------------------------------------
|   0 | SELECT STATEMENT   |      | 20990 |   368K|  1402   (3)|
|   1 |  NESTED LOOPS      |      | 20990 |   368K|  1402   (3)|
|   2 |   VIEW             |      |     2 |    12 |     4   (0)|
|   3 |    UNION-ALL       |      |       |       |            |
|   4 |     FAST DUAL      |      |     1 |       |     2   (0)|
|   5 |     FAST DUAL      |      |     1 |       |     2   (0)|
|   6 |   TABLE ACCESS FULL| T    | 10495 |   122K|   699   (3)|
----------------------------------------------------------------
--if the leading table is dual, can be used an /*+ordered*/ hint 
--after select clause


--4
create index fbi on t (b-c);
SELECT a, b, c
FROM t
union all
SELECT 'R',b,c
FROM t
WHERE b - c = 0;
--------------------------------------------------------------------------
| Id  | Operation                    | Name | Rows  | Bytes | Cost (%CPU)|
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |      |  1009K|    11M|  1384  (51)|
|   1 |  UNION-ALL                   |      |       |       |            |
|   2 |   TABLE ACCESS FULL          | T    |   999K|    11M|   700   (3)|
|   3 |   TABLE ACCESS BY INDEX ROWID| T    | 10000 |   117K|   683   (1)|
|   4 |    INDEX RANGE SCAN          | FBI  |  4000 |       |     3   (0)|
--------------------------------------------------------------------------

Keep in mind that Oracle does not know to predict well on or join clauses like in 3, so better you force a desired execution path. You should test to choose between 2 and 3 and 4(with the back of the cost of the index).

Florin Ghita
  • 17,525
  • 6
  • 57
  • 76
  • 1,2,3..etc are real data numbers. I don't know these numbers before executing an sql. – Artsiom Apr 01 '13 at 11:46
  • You don't need to hardcode anything. The aux table is a soup. You can write it as is. – Florin Ghita Apr 01 '13 at 11:49
  • This works perfectly for me. I'll look at the ex plan an tune somehow, but I believe it's the option. – Artsiom Apr 01 '13 at 12:14
  • How do you know `this is in memory -- no I/O`? The OP said the table was 'huge'. You're scanning the full set twice in any case, once to get the full set, and then to get the subset based on a condition, assigning the 'R' flag to rows in the subset where b=c. So it is not possible `to avoid Oracle look into table T twice`. Those tuples ARE being evaluated a second time. Also, he wasn't selecting constants but column-values: OP said `I don't know these numbers before executing an sql`. – Tim Apr 01 '13 at 13:46
  • @Tim Oh, no, the `dual union dual` is in memory, not the table t :) – Florin Ghita Apr 02 '13 at 06:39
  • A full table scan is "hidden" in `create index fbi on t (b-c);` :-) You are "pre-processing" the table. – Tim Apr 02 '13 at 11:23
  • @Tim That's not the downside of the index. You make a full scan and sort to create the index and then select from table many times. The downside of the index is that inserts and updates are much more slower, as I specified in the answer - there is a cost. – Florin Ghita Apr 02 '13 at 11:45