0

I have a big partitioned table to update (some billions of records and about 95M rows to update). Another lookup table has about 10M IDs that identify which rows to update in the big table mentioned above.

The following query used to identify partitions in the big table that have records to update:

SELECT /*+ parallel(8) */
dict.subobject_name
  FROM (SELECT DISTINCT dbms_rowid.rowid_object (trg.ROWID) data_object_id
          FROM lookup_table lkp INNER JOIN big_table trg ON lkp.id = trg.id) subq
INNER JOIN all_objects dict ON subq.data_object_id = dict.data_object_id;

DBMS_ROWID utility uses embedded ROWID functionality to get the segment identifier. So, this segment contains rows to update after joining with the lookup table. Then it joined with the ALL_OBJECTS data dictionary to get the connected partitions of the big table. As a result, I get 3 from 30 partitions to operate for partition exchange, but not all the partitions.

On the big table it took about 14 min to execute. Small table with 290K records to update it took about 53 sec. But the point is that it will run on about a thousand of tables and be executed for each of them in the FOR LOOP.

Do not think that EXISTS or IN will change the situation dramatically, but will check.

Any thoughts about this query optimization?

  • Sample data, desired results, and a clear explanation of what the code should be doing would help. – Gordon Linoff Jun 30 '21 at 13:45
  • @GordonLinoff tried to do my best and added some details about dbms_rowid and desired result – Yauheni Khvainitski Jun 30 '21 at 13:52
  • Please describe your goal clearly. Do you want to update *one* partition table or 1000 of them? Or do you want only to identify the partitions of one or thousands tables? – Marmite Bomber Jun 30 '21 at 16:29
  • Is the large table *range partitioned*? Did you try to partition the small table with the same schema? – Marmite Bomber Jun 30 '21 at 16:44
  • What is the purpose of identifying the partition of specific row by joining the table with that row? You've already get the rowid, just use it for `merge` statement. How partition identification can help here? – astentx Jun 30 '21 at 17:13
  • Well the OP *apparently* tries to process only the partitions that contains the changed data @astentx, but he failed to provide details. – Marmite Bomber Jun 30 '21 at 19:17
  • @MarmiteBomber I just need to optimize the query to identify needed partitions. Lookup table stores IDs that will identify rows in a big target table to update. Leave out anything else I just need to optimize mentioned query – Yauheni Khvainitski Jun 30 '21 at 19:30
  • You still did not explain why do you mean you will repeat the query thousand times, but you will need provide some additional [information](https://stackoverflow.com/a/34975420/4808122) so we can assist with the query tuning. – Marmite Bomber Jul 01 '21 at 08:09
  • Sounds like you're reinventing the wheel. A join between a non-partitioned table and partitioned one will typically use a bloom filter which in effect does what you are doing manually, ie, scan the small table, create a bloom filter to locater partitions that are needed in the big table. Updating a join between the table tables will probably do the trick just as well – Connor McDonald Jul 02 '21 at 01:57

0 Answers0