Looking at the code running in concurrent sessions environment, after each atomic statement we need to ask "what if another session have just broken our assumption?" and make adjustments according to that.
Option 1. Count and decide INSERT or UPDATE
declare
v_count int;
begin
SELECT count(1) INTO v_count FROM my_table WHERE ...;
IF v_count = 0 THEN
-- what if another session inserted the same row just before this point?
-- this statement will fail
INSERT INTO my_table ...;
ELSE
UPDATE my_table ...;
END IF;
end;
Option 2. UPDATE, if nothing is updated - INSERT
begin
UPDATE my_table WHERE ...;
IF SQL%COUNT = 0 THEN
-- what if another session inserted the same row just before this point?
-- this statement will fail
INSERT INTO my_table ...;
END IF;
end;
Option 3. INSERT, if failed - UPDATE
begin
INSERT INTO my_table ...;
exception when DUP_VAL_ON_INDEX then
-- what if another session updated the same row just before this point?
-- this statement will override previous changes
-- what if another session deleted this row?
-- this statement will do nothing silently - is it satisfactory?
-- what if another session locked this row for update?
-- this statement will fail
UPDATE my_table WHERE ...;
end;
Option 4. use MERGE
MERGE INTO my_table
WHEN MATCHED THEN UPDATE ...
WHEN NOT MATCHED THEN INSERT ...
-- We have no place to put our "what if" question,
-- but unfortunately MERGE is not atomic,
-- it is just a syntactic sugar for the option #1
Option 5. use interface for DML on my_table
-- Create single point of modifications for my_table and prevent direct DML.
-- For instance, if client has no direct access to my_table,
-- use locks to guarantee that only one session at a time
-- can INSERT/UPDATE/DELETE a particular table row.
-- This could be achieved with a stored procedure or a view "INSTEAD OF" trigger.
-- Client has access to the interface only (view and procedures),
-- but the table is hidden.
my_table_v -- VIEW AS SELECT * FROM my_table
my_table_ins_or_upd_proc -- PROCEDURE (...) BEGIN ...DML on my_table ... END;
PROCEDURE my_table_ins_or_upd_proc(pi_row my_table%ROWTYPE) is
l_lock_handle CONSTANT VARCHAR2(100) := 'my_table_' || pi_row.id;
-- independent lock handle for each id allows
-- operating on different ids in parallel
begin
begin
request_lock(l_lock_handle);
-->> this code is exactly as in option #2
UPDATE my_table WHERE ...;
IF SQL%COUNT = 0 THEN
-- what if another session inserted the same row just before this point?
-- NOPE it cannot happen: another session is waiting for a lock on the line # request_lock(...)
INSERT INTO my_table ...;
END IF;
--<<
exception when others then
release_lock(l_lock_handle);
raise;
end;
release_lock(l_lock_handle);
end;
Not going too deep into low level details here, see this article to find out how to use locks in Oracle DBMS.
Thus, we see that options 1,2,3,4 have potential problems that cannot be avoided in a general case. But they could be applied if the safety is guaranteed by domain rules or a particular design conventions.
Option 5 is bulletproof and fast as it is relies on the DBMS contracts.
Nevertheless, this will be a prize for clean design, and it cannot be implemented if my_table
is barenaked and clients rely on straightforward DML on this table.
I believe that performance is less important than data integrity, but let's mention that for completeness.
After proper consideration it is easy to see that the options order according to the "theoretical" average performance is:
2 -> 5 -> (1,4) -> 3
Of course, the step of performance measuring goes after obtaining at least two properly working solutions, and should be done exclusively for a particular application under a given workload profile. And that is another story. At this moment no need to bother about theoretical nanoseconds in some synthetic benchmarks.
I guess currently we see that there will be no magic. Somewhere in the application it is required to ensure that every id
inserted into my_table
is unique.
If id values do not matter (95% of cases) - just go for using a SEQUENCE.
Otherwise, create a single point of manipulation on my_table
(either in Java or in DBMS schema PL/SQL) and control the uniqueness there. If the application can guarantee that at most a single session at a time manipulates data in my_table
, then it is possible to just apply the option #2.