1

I am currently working on a internal legacy project and to get more familiar with the backend logic and database design, I have decided to rewrite all queries from the old fashion SQL-86 standard into a more modern one. But I am stuck on one big view and one participial join (this example is very simplified):

select t4.id
from
    t1,
    t2,
    t3,
    t4
where 
    t2.t1id = t1.id(+)
    and t2.tmpid = t3.tmpid(+)
    and t4.id = t3.t4id(+)
group by t4.id 
having
    count (t2.t1id) = count(t3.tmpid);

which is based on the following (example) table structure:

create table t1 (
    id number generated by default on null as identity primary key,
    title varchar(100)
);

create table t2 (
    t1id number not null,
    tmpid number not null
);

create table t3 (
    tmpid number not null,
    t4id number not null
);

create table t4 (
    id number generated by default on null as identity primary key,
    title varchar(100)
);

insert into t1(id, title) values(1, 'Order1');
insert into t1(id, title) values(2, 'Order2');

insert into t4(id, title) values(1, 'Attribute 1');
insert into t4(id, title) values(2, 'Attribute 2');

insert into t2(t1id, tmpid) values(1, 1);
insert into t2(t1id, tmpid) values(1, 2);
insert into t2(t1id, tmpid) values(2, 1);
insert into t2(t1id, tmpid) values(2, 2);

insert into t3(tmpid, t4id) values(1, 1);
insert into t3(tmpid, t4id) values(2, 2);
insert into t3(tmpid, t4id) values(2, 1);

The query in question results in the following result when removing the "group by" and "having" statement:

select *
from
    t1,
    t2,
    t3,
    t4
where 
    t2.t1id = t1.id(+)
    and t2.tmpid = t3.tmpid(+)
    and t4.id = t3.t4id(+)
order by t1.id

enter image description here

First of all I thought the and t4.id = t3.t4id(+) was just a "wrong" (well it works for over a decade now so it can't really be wrong) specified old fashioned join and just replaced it with a and t3.t4id = t4.id(+) so it would be a "left join" on ANSI SQL, but then the rows with a "null" values just disappears and the having count(t2.t1id) = count(t3.tmpid) results to true on every row.

So my question is to what resolves the and t4.id = t3.t4id(+) in a ANSI SQL and why? Out of desperation I have tried it with every join and I can't reproduce the null values which are mandatory for the "having".

Or is there is there another way to implement this query in a modern way?

  • 1
    It is a very good idea to re-write the joins. It would be even better to understand what the queries are supposed to do and write them completely anew. With window functions, `FETCH FIRST n ROWS`, lateral joins, etc. now available, some queries may become a lot easier to read and maintain. But going from the old Oracle joins to standard SQL joins is already a great step. Good luck with this. – Thorsten Kettner Feb 26 '23 at 19:41
  • 1
    @ThorstenKettner yes that's the actual goal. Of course I want to understand the logical behavior of the queries in the big picture to get a understanding of the dependencies and to improve performance nerverless. – audacity363 Feb 26 '23 at 19:56
  • why? I always prefer Oracle style using (+) – alvalongo Feb 28 '23 at 18:49

1 Answers1

3

There are two tables joined with (+): t1 and t3. So what you have is

from t2
cross join t4
left join t1 on t1.id = t2.t1id
left join t3 on t3.tmpid = t2.tmpid
             and t3.t4id = t4.id

In your create table statements t2.t1id is a not null column. With a proper foreign key constraint, t2.t1id would always have a parent row match in t1.id and the t1 join would just be an inner join really.

If you are working with Oracle's SQL Developer, you can have the program re-write the queries from old Oracle proprietary joins to ANSI joins by the way. See thatjeffsmith's answer here: https://stackoverflow.com/a/69284420/2270762.

Thorsten Kettner
  • 89,309
  • 7
  • 49
  • 73
  • Wow thx. This was actually easy... Unfortunately the refactoring function of the sqldeveloper mostly says "Text is not a valid, single SELECT-Statement". "With a proper foreign key constraint,..." - This was just me in the test definition. In the real environment there are actually no constraints. Just a "number" field... – audacity363 Feb 26 '23 at 19:45
  • One thing I should mention: The only drawback I know of using SQL Developer for the automatic re-write is that the tool mends incomplete left outer joins. So `from a, b where a.x = b.x(+) and a.y = b.y` becomes `from a left outer join b on a.x = b.x and a.y = b.y` where it originally was `from a inner join b on a.x = b.x and a.y = b.y`, because y was not joined with (+). – Thorsten Kettner Feb 26 '23 at 19:46
  • 1
    Sorry to hear this about the rewrite function. I've only tried it once or twice with simple SQL. I always re-write from scratch without the tool :-) – Thorsten Kettner Feb 26 '23 at 20:08