1

I am running a spring boot application which will read a record/row in Oracle database, process/transform it, save the transformed data into a secondary database of no-sql type. My requirement is to keep track of the record/row which has already been processed by the application. I have following constraints:

  1. The data is huge and processing all the records in one go is not possible.
  2. Adding a column (maybe a boolean flag to indicate whether the row is processed or not) to the database for keeping track is not an option.
Rax
  • 323
  • 1
  • 5
  • 17
  • The purpose of this application is to migrate the data from primary(oracle database having billions of records) to secondary (no-sql database) database. – Rax Dec 09 '19 at 08:30
  • 1
    What about adding table with column related to primary key column of table in question and column for flag? – Romeo Ninov Dec 09 '19 at 09:07

1 Answers1

1

I think you might want to consider the ROWID Pseudocolumn, depending on what is currently happening on the database. ROWID shall be unique for the row unless you're doing specific operations to the table: What can cause an Oracle ROWID to change?

  • Export/import of the table
  • ALTER TABLE XXXX MOVE
  • ALTER TABLE XXXX SHRINK SPACE
  • FLASHBACK TABLE XXXX
  • Splitting a partition
  • Updating a value so that it moves to a new partition
  • Combining two partitions
  • In case of IOT an update to the primary key would give you a different ROWID as well.

In case that none of those operations are taking place, you might to select the record with it's rowid, store it to your second database (or anywhere) and always check, if the record was already processed. If you don't know what might be happening to the database, then creating new table with list of processed rows referenced by the primary key of the main table seems to be good idea. Consider creating the second table as Index-organized table, as it will be growing, it might save you a lot of unnecessary row access by rowid.