UPDATE statement: re-evaluate subquery after each SET

Question

CREATE TABLE test (id NUMBER(3));

INSERT INTO test VALUES (1);
INSERT INTO test VALUES (2);
INSERT INTO test VALUES (2);
INSERT INTO test VALUES (3);
INSERT INTO test VALUES (4);
INSERT INTO test VALUES (4);
INSERT INTO test VALUES (5);

I want to make the numbers in test unique (like {1,2,3,4,5,6,7} - i.e. remove doubles) using this query:

UPDATE test u SET u.id = (SELECT max(nn.id) FROM test nn) + 1 
  WHERE 1 < (
    SELECT tt.rown 
      FROM (SELECT rowid, row_number() over ( partition by t.id order by t.id) AS rown FROM test t) tt
      WHERE tt.rowid = u.rowid
    );

The above query updates the table to {1,2,3,4,5,6,6}. It replaces the second 2 to 6 correctly, but the second 4 becomes 6 also, which should be 7. Existing non-duplicates should not be replaced, but only the second-onward duplicates.

The problem with the above update statement is that (SELECT max(nn.id) FROM test nn) is evaluated once and cached, but it actually depends on what is updated. Is it possible to force re-evaluation of (SELECT max(nn.id) FROM test nn) after each SET? I tried some hints like /*+NOCACHE*/ without success.

In other words, during update, you need to take into account the fields that already have been updated.

Anyone ideas?

I assume this could be solved with NON-DETERMINISTIC function, but I don't want to make functions. EDIT: if I try to compute the id with a function i get ORA-04091: table TEST is mutating, trigger/function may not see it. Using PRAGMA AUTONOMOUS_TRANSACTION; gives the same result as the above query.

Solution with precalculated ids, bypassing query re_evaluation

AudriyM solved this (cf. comments) for MS SQL Server 2008 using CTEs. There is no alternative in Oracle for CTEs as far as I know, but because AudriyM's solution was based on precalculated values for ids, I though I could translate it in Oracle with subqueries. And here it is:

UPDATE test u SET u.id = ( SELECT newIDs.newID
                           FROM ( SELECT  ranked.rowid,
                                          ranked.m + row_number() over (order by ranked.id, ranked.r) as newID 
                                  FROM ( SELECT t.rowid, t.id, row_number() over ( partition by t.id order by t.id) AS r, max(id) over() AS m 
                                        FROM test t ) ranked
                                  WHERE ranked.r > 1 ) newIDs
                           WHERE u.rowid = newIDs.rowid )
WHERE u.rowid IN ( SELECT ranked.rowid
                   FROM ( SELECT t.rowid, t.id, row_number() over ( partition by t.id order by t.id) AS r, max(id) over() AS m 
                          FROM test t ) ranked
                   WHERE ranked.r > 1 );

Solution without precalculated ids, using query re_evaluation

Still not found and the question remains unanswered.

I'm struggling to understand the point of this query. Please can you explain why you are using it? and what your expected outcome is? — sarin, Jun 02 '14 at 12:27
I have edited the question. I simply want to make each consequtive allready existant number equal to the maximum number + 1 in the table. The problem is the caching of `(SELECT max(nn.id) FROM test nn)` and I wonder whether this caching mechanism can be disabled. Is it even possible because the table is mutating? — Davor Josipovic, Jun 02 '14 at 12:30
This better be a one-off (or special maintenance) query, because in order to be thread-safe `SELECT MAX(...) + 1` style stuff has to lock the entire table. In order for this sort of thing to work, the db needs some way to tell the duplicated rows apart - is there some other column that's unique? Alternatively, do you need to keep matching rows - what about just [deleting duplicates](http://stackoverflow.com/questions/21067033/delete-duplicate-records-using-rownum-in-sql)? — Clockwork-Muse, Jun 02 '14 at 13:12
I cannot delete duplicates nor can I replace the numbers of existing non-duplicates. I may also not replace the number of the first duplicate. This is more a question about algorithm than anything else. — Davor Josipovic, Jun 02 '14 at 13:45
What do you mean by *first duplicate*? Is there a way to tell the first instance from any other? — Andriy M, Jun 02 '14 at 16:27
@AndriyM There is, but for the essence of the example, it doesn't matter. The default SQL engine ordering is good enough. And I am not so much interested in an other query that can solve this - I think I can make one to do that already. I am more interested in how to force the SQL engine to re-evaluate `(SELECT max(nn.id) FROM test nn)` on each `SET`. Or better: is it even possible to force it to re-evaluate? — Davor Josipovic, Jun 02 '14 at 17:19
Oh, I see. Personally, I wouldn't take this direction, but then I'm not an Oracle man either. I mean, I know that I would definitely solve this problem differently in SQL Server, and that way is unavailable in Oracle (but I was asking because I was trying to find a more or less elegant equivalent). — Andriy M, Jun 03 '14 at 05:27
@AudriyM You can share your thoughts. I am interested in 'how differently'. No need for code. — Davor Josipovic, Jun 03 '14 at 05:49
Posting as a comment as I haven't yet figured out how to translate this into Oracle: http://sqlfiddle.com/#!3/b2634/2 That uses the concept of updatable CTEs (WITH clauses). Perhaps using ROWIDs could help in some way, I'm just not sure how to both filter out the rows that must not change *and* modify the rest correctly. — Andriy M, Jun 03 '14 at 06:13
@AndriyM I translated your query in Oracle and updated the question. — Davor Josipovic, Jun 06 '14 at 18:22
Please consider posting your solution as an answer, so that you can formally accept it eventually (once you feel its time to close the quest[ion]). — Andriy M, Jun 07 '14 at 07:33

score 0 · Answer 1 · answered Jun 02 '14 at 13:19

0

Try to use the following statement. Your table doesn't allow to identify a row for example with ID=2 so to link table with ROW_NUMBER you can use ROWID

UPDATE
Test 
SET id = (SELECT RN FROM
                     (SELECT ROW_NUMBER() 
                             OVER (ORDER BY ID) as RN 
                             FROM Test
                     ) T1
                     WHERE T1.RowID=Test.RowID
          )

SQLFiddle demo

answered Jun 02 '14 at 13:19

valex

23,966
7
43
60

You also could have used `ROWNUM` here too. Still, the problem persists: existing non-duplicates should remain unchanged, and only the second-onward duplicates should be replaced. I would stick to the original SQL and concentrate on the question whether `(SELECT max(nn.id) FROM test nn)` can be re-evaluated after each SET operation. – Davor Josipovic Jun 02 '14 at 13:31
But the second `2` has to be replaced to `3` because it is before next values. But in this case what should you do with the next row `3` to avoid duplication of `3`? It's not clear. – valex Jun 02 '14 at 13:34
No, the second `2` should be replaced to `(SELECT max(nn.id) FROM test nn)` as stated in the original query. The original query really does what it should. The only problem is the caching of `(SELECT max(nn.id) FROM test nn)` – Davor Josipovic Jun 02 '14 at 13:36

UPDATE statement: re-evaluate subquery after each SET

Solution with precalculated ids, bypassing query re_evaluation

Solution without precalculated ids, using query re_evaluation

1 Answers1

Linked