1

To be honest I am not sure if this is because of joining tables with DB links.

Here is my query:

SELECT
DISTINCT
AA.NAME,
DD.TOURNAMENT_TIME, 
BB.CLUB_TYPE 
FROM
ROOT.test_adviser@123test.com CC
INNER JOIN 
ROOT.test_club@123test.com BB
ON BB.adviser_id = CC.adviser_id
INNER JOIN ROOT.test_student@123test.com AA
ON AA.student_id = BB.student_id
INNER JOIN test_tournament DD
ON DD.tournament_id = AA.tournament_id AND DD.slot_number = BB.slot_number

Query above runs over 20 minutes. If this is due to using JOIN inefficiently any idea how I could clean the query? Any help is greatly appreciated.

Other information

Table DD - has 7 rows ( This is the count I hope to return with the columns from other tables joined)

Table AA, CC, BB - has more than 60k rows.

Eseresaure
  • 69
  • 1
  • 5
  • Can you show table schemas too? Probably missing index on some table... – KuKeC May 11 '18 at 13:04
  • The problem is likely due to the database links. This makes it harder for Oracle to optimize the query. – Gordon Linoff May 11 '18 at 13:10
  • Please learn how to document [performance problems in the database](https://stackoverflow.com/questions/34975406/how-to-describe-performance-issue-in-relational-database?answertab=active#tab-top). The first thing is the execution plan, you'll see at what site the statement is executed and what data is transfered over network. This give the first clue about the problem cause. – Marmite Bomber May 11 '18 at 13:18

2 Answers2

0

You can try to create local copies of all tables accessed by db link

create table test_anviser as
  select * from ROOT.test_adviser@123test.com;

Think if these test tables in a class environment they shouldn't be too big.

Then replace each table name with @db-link with just table name

ROOT.test_adviser@123test.com -> test_adviser

Even there are no indexes on local copies you should get the result much faster.

oreh
  • 1,129
  • 1
  • 9
  • 16
  • I'd rather create a *materialized view* which can be refreshed, not a table. Also, generally speaking, I'd rather index appropriate column(s) than run my queries without them. – Littlefoot May 11 '18 at 20:36
  • This is not the best advice possible. Much simpler is to copy only the 7 rows table `TEST_TOURNAMENT` to the remote site, connect there and run the query locally. – Marmite Bomber May 12 '18 at 18:03
0

You provided close to no information required to diagnose the query. Therefor it is not possible to give you an answer. Instead I provide some advise that should help you to fix the performance problem.

First you need to ...

Understand the Query

The best way is to draw the tables and the join paths as shown below.

enter image description here

You see that the small table on the left with 7 rows (local site) is joined with the two of the remote tables, that are all joined together. How can Oracle perform this join? There are two possibilities.

The simplest way is that Oracle process the seven rows local table in a nested loop join and gets the required records using a remote query.

The second way is to pass the 7 rows to the remote site and perform the whole query there.

To see wich option is bettwer you need first to...

Understand the Execution Plan

Using EXPLAIN PLAN you may check what is Oracle actually doing. I'm showing plan on generated data based on your description.

EXPLAIN PLAN  SET STATEMENT_ID = 'jara1' into   plan_table  FOR
SELECT /*+  driving_site(CC) USE_HASH(AA BB CC)  */
DISTINCT
AA.NAME,
DD.TOURNAMENT_TIME, 
BB.CLUB_TYPE 
FROM
test_adviser@MY_LINK CC
INNER JOIN 
test_club@MY_LINK BB
ON BB.adviser_id = CC.adviser_id
INNER JOIN test_student@MY_LINK AA
ON AA.student_id = BB.student_id
INNER JOIN test_tournament DD
ON DD.tournament_id = AA.tournament_id AND DD.slot_number = BB.slot_number;


SELECT * FROM table(DBMS_XPLAN.DISPLAY('plan_table', 'jara1','ALL')); 


---------------------------------------------------------------------------------------------------------
| Id  | Operation             | Name            | Rows  | Bytes | Cost (%CPU)| Time     | Inst   |IN-OUT|
---------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |                 |     1 |    88 |    33   (4)| 00:00:01 |        |      |
|   1 |  HASH UNIQUE          |                 |     1 |    88 |    33   (4)| 00:00:01 |        |      |
|   2 |   NESTED LOOPS        |                 |     1 |    88 |    32   (0)| 00:00:01 |        |      |
|   3 |    NESTED LOOPS       |                 |     1 |    75 |    31   (0)| 00:00:01 |        |      |
|   4 |     NESTED LOOPS      |                 |     7 |   217 |    17   (0)| 00:00:01 |        |      |
|   5 |      TABLE ACCESS FULL| TEST_TOURNAMENT |     7 |    98 |     3   (0)| 00:00:01 |        |      |
|   6 |      REMOTE           | TEST_STUDENT    |     1 |    17 |     2   (0)| 00:00:01 | MY_LI~ | R->S |
|   7 |     REMOTE            | TEST_CLUB       |     1 |    44 |     2   (0)| 00:00:01 | MY_LI~ | R->S |
|   8 |    REMOTE             | TEST_ADVISER    |     1 |    13 |     1   (0)| 00:00:01 | MY_LI~ | R->S |
---------------------------------------------------------------------------------------------------------

Remote SQL Information (identified by operation id):
----------------------------------------------------

   6 - SELECT "STUDENT_ID","TOURNAMENT_ID","NAME" FROM "TEST_STUDENT" "AA" WHERE 
       :1="TOURNAMENT_ID" (accessing 'MY_LINK' )

   7 - SELECT "STUDENT_ID","SLOT_NUMBER","ADVISER_ID","CLUB_TYPE" FROM "TEST_CLUB" "BB" WHERE 
       :1="SLOT_NUMBER" AND :2="STUDENT_ID" (accessing 'MY_LINK' )

   8 - SELECT "ADVISER_ID" FROM "TEST_ADVISER" "CC" WHERE :1="ADVISER_ID" (accessing 'MY_LINK' )

So you see this is the first option described above. Local NESTED LOOPs followed with remote access for each ID.

Index Definition

Note that those indexes are required for this query

create index test_student_idx1 on test_student(tournament_id);
create index test_club_idx1 on test_club(slot_number);
create index test_club_idx2 on test_club(student_id);
create index test_adviser_IDX1 on test_adviser(adviser_id);

Is this a good choise? Well it maight be - in case that remote queries return very small number of rows.

So basically it boilds down to the question how many rows there are with the selected TOURNAMENT_ID and SLOT_NUMBER.

You may quickly check it with following query:

select AA.tournament_id, count(*) from test_student@MY_LINK AA
INNER JOIN test_tournament DD
ON DD.tournament_id = AA.tournament_id
group by aa.tournament_id;

If the query (and the analogous one for the CLUB table) returns a high number of rows (which a suspect) this access is not optimal. For example is you see a 1000 recors per key in both queries, you may expect one million (1000*1000) remote access calls.

In that case you must switch to the second option - complete remote query. The DRIVING_SITE hint is you friend. The hint specifies, that the whole query should be done on the remote site. So the table test_tournament will be moved the the remote site and teh query will be exceuted there.

The only difference is, that Oracle performs not so advanced optimization of the execution plan as in case of the local query.

It might be, that even this will lead to a suboptimal plan (though better than the pervious one) with the dying in the nested loop effect. Than you simple force the HASH_JOIN using a hint.

So add to your query one of the following hints and observe the behavior.

 SELECT /*+  driving_site(CC) */

 SELECT /*+  driving_site(CC) USE_HASH(AA BB CC) */

Test which is better for your data.

This is the execution plan using the HASH join. Note, that the plan is executed complete on the remote site. Plans with HASH JOIN does not return instanly, but the big advantage is, they doesn't end with extreme long elapsed times as in case of high volume NESTED LOOPs joins.

------------------------------------------------------------------------------------------------------------
| Id  | Operation              | Name              | Rows  | Bytes | Cost (%CPU)| Time     | Inst   |IN-OUT|
------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT REMOTE|                   |     1 |    55 |  1328   (1)| 00:00:01 |        |      |
|   1 |  HASH UNIQUE           |                   |     1 |    55 |  1328   (1)| 00:00:01 |        |      |
|*  2 |   HASH JOIN SEMI       |                   |     1 |    55 |  1327   (1)| 00:00:01 |        |      |
|*  3 |    HASH JOIN           |                   |     1 |    50 |  1289   (1)| 00:00:01 |        |      |
|*  4 |     HASH JOIN          |                   |     8 |   264 |  1148   (1)| 00:00:01 |        |      |
|   5 |      REMOTE            | TEST_TOURNAMENT   |     7 |    98 |     2   (0)| 00:00:01 |      ! | R->S |
|   6 |      TABLE ACCESS FULL | TEST_CLUB         |   130K|  2412K|  1146   (1)| 00:00:01 | REMOT~ |      |
|   7 |     TABLE ACCESS FULL  | TEST_STUDENT      |   130K|  2158K|   141   (1)| 00:00:01 | REMOT~ |      |
|   8 |    INDEX FAST FULL SCAN| TEST_ADVISER_IDX1 | 60000 |   292K|    38   (0)| 00:00:01 | REMOT~ |      |
------------------------------------------------------------------------------------------------------------


Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("A3"."ADVISER_ID"="A4"."ADVISER_ID")
   3 - access("A1"."TOURNAMENT_ID"="A2"."TOURNAMENT_ID" AND "A2"."STUDENT_ID"="A3"."STUDENT_ID")
   4 - access("A1"."SLOT_NUMBER"="A3"."SLOT_NUMBER")


Remote SQL Information (identified by operation id):
----------------------------------------------------

   5 - SELECT "TOURNAMENT_ID","SLOT_NUMBER","TOURNAMENT_TIME" FROM "TEST_TOURNAMENT" "A1" 
       (accessing '!' )


Note
-----
   - fully remote statement
Marmite Bomber
  • 19,886
  • 4
  • 26
  • 53