3

I have two tables that I'm trying to join and insert into another table based on a combination of three columns. I'll explain.

Table M

| ANO      | BNO     |         Timestamp   |  Duration
---------------------------------------------------------------
|  5612853 | 4732621 | 21.11.2013 09:50:58 | 196
|  4842988 | 5610953 | 21.11.2013 17:34:58 | 98
|  7765759 | 5612853 | 21.11.2013 20:48:00 | 377
|  2470321 | 2470263 | 21.11.2013 21:47:18 | 125

Table N

| ANO      | BNO     |         Timestamp   |  Duration
---------------------------------------------------------------
|  5612853 | 4732621 | 21.11.2013 09:50:52 | 196
|  4842988 | 5610953 | 21.11.2013 17:34:53 | 98
|  7765759 | 5612853 | 21.11.2013 20:47:55 | 377
|  2470321 | 2470263 | 21.11.2013 21:47:13 | 125

Now these two tables have to be matched and inserted into Table MN based on

M.ANO=N.ANO and M.BNO=N.BNO and ((M.TIMESTAMP = N.TIMESTAMP+5/86400) or (M.TIMESTAMP = N.TIMESTAMP+6/86400))

So in theory, my output Table MN should be

| ANO      | BNO     |         Timestamp   |  Duration  || ANO      | BNO     |         Timestamp   |  Duration
--------------------------------------------------------------------------------------------------------------
|  5612853 | 4732621 | 21.11.2013 09:50:58 | 196        ||  5612853 | 4732621 | 21.11.2013 09:50:52 | 196
|  4842988 | 5610953 | 21.11.2013 17:34:58 | 98         ||  4842988 | 5610953 | 21.11.2013 17:34:53 | 98
|  7765759 | 5612853 | 21.11.2013 20:48:00 | 377        ||  7765759 | 5612853 | 21.11.2013 20:47:55 | 377
|  2470321 | 2470263 | 21.11.2013 21:47:18 | 125        ||  2470321 | 2470263 | 21.11.2013 21:47:13 | 125

Table M has about 1.4 million records, and Table N has about 0.9 million.

I've tried to join the two tables based on the below two queries. But it takes hours to execute and that isn't feasible if I have to run this on a daily basis.

INSERT INTO MN_RECON (
SELECT M.*,N.* FROM M FULL OUTER JOIN N ON
M.ANO=N.ANO AND M.BNO=N.BNO AND 
((M.TIMESTAMP=N.TIMESTAMP+5/86400) OR (M.TIMESTAMP=N.TIMESTAMP+6/86400))

INSERT INTO MN_RECON (
SELECT M.*,N.* FROM M FULL OUTER JOIN N ON
M.ANO=N.ANO AND M.BNO=N.BNO AND
(M.TIMESTAMP-N.TIMESTAMP IN (5/86400,6/86400)

When I run just the SELECT statement of the above 2 queries I get an output within a minute (just a few 100 sample lines) but with the INSERT added it takes a very long time. Is there a way to optimize what I want to do?

I need it to match on timestamp because there can be multiple occurrences of the same ANO - BNO combination during the day, with the timestamp being the unique identifier between them

And I need a full outer join because I need to focus on records that are not matched, as well as matched records with a difference in duration between the two tables.

Additional Oracle information Oracle Database 11g Enterprise Edition 11.2.0.3.0 64-bit Production

EXPLAIN PLAN

| Id  | Operation                | Name            | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------------
|   0 | INSERT STATEMENT         |                 |  2386K|   530M|       |  2395M  (1)|999:59:59 |
|   1 |  LOAD TABLE CONVENTIONAL | MN_RECON        |       |       |       |            |          |
|   2 |   VIEW                   |                 |  2386K|   530M|       |  2395M  (1)|999:59:59 |
|   3 |    UNION-ALL             |                 |       |       |       |            |          |
|*  4 |     HASH JOIN RIGHT OUTER|                 |  1417K|   109M|    49M| 10143   (1)| 00:02:02 |
|   5 |      TABLE ACCESS FULL   | N_VOICE         |   968K|    38M|       |  1753   (1)| 00:00:22 |
|   6 |      TABLE ACCESS FULL   | M_VOICE         |  1417K|    52M|       |  2479   (1)| 00:00:30 |
|*  7 |     FILTER               |                 |       |       |       |            |          |
|   8 |      TABLE ACCESS FULL   | N_VOICE         |   968K|    38M|       |  1754   (1)| 00:00:22 |
|*  9 |      TABLE ACCESS FULL   | M_VOICE         |     1 |    29 |       |  2479   (1)| 00:00:30 |
Joshua1729
  • 775
  • 4
  • 10
  • 19
  • Have you tried: "(M.TIMESTAMP IN (I.TIMESTAMP+5/86400,I.TIMESTAMP+6/86400))"? Posting an explain plan is usually essential for query performance problems. – David Aldridge Nov 25 '13 at 07:37
  • Also what's your exact oracle version, edition and licensed options? – David Aldridge Nov 25 '13 at 07:38
  • Nope, I've not tried that yet, currently I'm trying this ABS(TO_CHAR(M.MSC_CALL_DATE,'SSSSS')-TO_CHAR(O.OCS_CALL_DATE, 'SSSSS')) > 5 and ABS(TO_CHAR(M.MSC_CALL_DATE,'SSSSS')-TO_CHAR(O.OCS_CALL_DATE, 'SSSSS')) < 7 Explain plan as in? Why I need this done? I'm trying to match two sources of call records to ensure completeness between the two i.e whatever is in source A is also in Source B. – Joshua1729 Nov 25 '13 at 07:39
  • The exact oracle version is Oracle Database 11g Enterprise Edition 11.2.0.3.0 64-bit Production – Joshua1729 Nov 25 '13 at 07:43
  • Why are you not interested in a difference of 4 seconds or less? – Twinkles Nov 25 '13 at 07:45
  • Because Source B lags behind Source A by a difference of 5.something seconds. – Joshua1729 Nov 25 '13 at 07:45
  • The explain plan tells you how the query is being executed -- full table scan, etc -- and gives an estimate of the amount of temp space required. – David Aldridge Nov 25 '13 at 07:48
  • I've added the explain plan - Learnt something new today – Joshua1729 Nov 25 '13 at 08:02
  • Do you have the partitioning option? – David Aldridge Nov 25 '13 at 09:30

1 Answers1

3

A simple way to speed up the query is by creating a function-based index:

CREATE INDEX indexname1 ON N (timestamp+5/86400);
CREATE INDEX indexname2 ON N (timestamp+6/86400);
Twinkles
  • 1,984
  • 1
  • 17
  • 31
  • I'm not too clear on how to use indexes (I've never used them before) should I run this before executing my join? and I keep my join conditions the same? – Joshua1729 Nov 25 '13 at 07:47
  • Indexes are the most important factor when performance is important for you. I recommend you look into them. – Twinkles Nov 25 '13 at 07:53
  • Since that seems to imply you have not a single index defined on either table, you should do: `CREATE INDEX indexname on M (ANO,BNO)` and `CREATE INDEX indexname on N (ANO,BNO)`. – Twinkles Nov 25 '13 at 07:55
  • Create the indexes before executing the query. No need to change the query itself. – Twinkles Nov 25 '13 at 07:56
  • Check if this query might be useful: `SELECT M.ANO,M.BNO,M.TIMESTAMP TIMESTAMP_M,n.TIMESTAMP TIMESTAMP_M,M.DURATION DURATION_M, N.DURATION DURATION_N, M.TIMESTAMP-N.TIMESTAMP FROM M INNER JOIN N ON (M.ANO=N.ANO AND M.BNO=N.BNO) – Twinkles Nov 25 '13 at 08:00
  • Thanks!!! The indexes worked !!! took 35 seconds to run!! I really need to read up on this index stuff! And I used my original query because I need a full outer join!! Thanks again! – Joshua1729 Nov 25 '13 at 08:18